mesa.git
9 years ago st/mesa: properly handle u_upload_alloc failure
Ilia Mirkin [Sat, 5 Sep 2015 17:11:27 +0000 (13:11 -0400)]
st/mesa: properly handle u_upload_alloc failure

vbuf is never null. We want to make sure that a resource was allocated
for the vbuf, which is *vbuf.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
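
The distinction above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `struct resource`, `alloc_resource()` and `upload()` are hypothetical stand-ins, not the actual st/mesa or u_upload_mgr code. The point is that an out-parameter pointer is always valid, so the failure check has to look at what it points to.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GPU buffer resource. */
struct resource { int id; };

/* Hypothetical allocator: stores the result through 'out' and leaves
 * *out == NULL when the allocation fails. */
static void
alloc_resource(bool simulate_failure, struct resource **out)
{
   static struct resource backing = { 1 };
   *out = simulate_failure ? NULL : &backing;
}

/* Returns true only when a resource was actually allocated. */
static bool
upload(bool simulate_failure)
{
   struct resource *res = NULL;
   struct resource **vbuf = &res;   /* vbuf itself is never NULL */

   alloc_resource(simulate_failure, vbuf);

   /* Testing 'vbuf' would always succeed; the result lives in '*vbuf'. */
   return *vbuf != NULL;
}
```
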
9 years ago nouveau: don't mark full range as used on unmap with explicit flush
Ilia Mirkin [Thu, 2 Jul 2015 22:44:18 +0000 (18:44 -0400)]
nouveau: don't mark full range as used on unmap with explicit flush

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
9 years ago nv50: avoid using inline vertex data submit when gl_VertexID is used
Ilia Mirkin [Mon, 24 Aug 2015 15:49:05 +0000 (11:49 -0400)]
nv50: avoid using inline vertex data submit when gl_VertexID is used

The hardware only generates vertexid when vertices come from a VBO. This
fixes:

  vertexid-drawelements
  vertexid-drawarrays

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
9 years ago nv50: don't flush vertex arrays when index buffer changes
Ilia Mirkin [Sat, 4 Jul 2015 00:32:53 +0000 (20:32 -0400)]
nv50: don't flush vertex arrays when index buffer changes

The index buffer is fed in inline over a pushbuf. It's not related to
vertices or any caching that might be done on them.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
9 years ago nv50: rebind bo to bufctx when invalidating idxbuf storage
Ilia Mirkin [Sat, 4 Jul 2015 00:16:48 +0000 (20:16 -0400)]
nv50: rebind bo to bufctx when invalidating idxbuf storage

There is nothing to be done on a dirty idxbuf, but the bo may have
changed, so we have to rebind it to the bufctx.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
9 years ago nv50: clear buffer status on all vertex bufs, not just the first one
Ilia Mirkin [Fri, 3 Jul 2015 23:21:21 +0000 (19:21 -0400)]
nv50: clear buffer status on all vertex bufs, not just the first one

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
9 years ago nv50: fix drawing from tfb, direct-to-pushbuf submits
Ilia Mirkin [Thu, 1 Jan 2015 11:09:59 +0000 (06:09 -0500)]
nv50: fix drawing from tfb, direct-to-pushbuf submits

The stride was being set to 0, which is illegal (and also nonsensical).
Also we must wait for the buffer to become available for reading as
otherwise a wrong value may be prefetched. Since we must wait for the
buffer anyways, and it's mapped and in GART, we may as well avoid the
annoyance of the indirect pushbuf submit.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
9 years ago i965: Remove base miplevel from sampler state.
Ben Widawsky [Fri, 4 Sep 2015 17:42:33 +0000 (10:42 -0700)]
i965: Remove base miplevel from sampler state.

Gen9 changes the meaning of this to coarse LOD quality mode. Although that's a
desirable thing to be setting, it doesn't match the gen8 behavior and this was
unintentional. More importantly, we don't ever use this field. So instead of
getting it "wrong" drop it entirely.

This is a respin of a patch which only [incorrectly] tried to address gen9.

Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago docs: add news item and link release notes for 10.6.6
Emil Velikov [Fri, 4 Sep 2015 22:11:40 +0000 (23:11 +0100)]
docs: add news item and link release notes for 10.6.6

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
9 years ago docs: add sha256 checksums for 10.6.6
Emil Velikov [Fri, 4 Sep 2015 22:05:47 +0000 (23:05 +0100)]
docs: add sha256 checksums for 10.6.6

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit e3e2a3e0e581da39dcd9268951edb52f68916940)

9 years ago docs: add release notes for 10.6.6
Emil Velikov [Fri, 4 Sep 2015 21:16:07 +0000 (22:16 +0100)]
docs: add release notes for 10.6.6

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 4b05739e9d718a48415270b95c0a73b56666c364)

9 years ago llvmpipe: convert double to long long instead of unsigned long long
Oded Gabbay [Thu, 3 Sep 2015 16:00:26 +0000 (19:00 +0300)]
llvmpipe: convert double to long long instead of unsigned long long

round(val*dscale) produces a double result, as val and dscale are double.
However, LLVMConstInt receives unsigned long long, so there is an
implicit conversion from double to unsigned long long.
This is undefined behavior. Therefore, we need to first explicitly
convert the round result to long long, and then let the compiler handle
conversion from that to unsigned long long.

This bug manifests itself in POWER, where all IMM values of -1 are being
converted to 0 implicitly, causing a wrong LLVM IR output.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
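
A minimal sketch of the two-step conversion described above. The function name `imm_from_rounded()` is hypothetical, and the argument is assumed to already be the result of round(val*dscale); this is not the llvmpipe code itself.

```c
/* Convert an already-rounded double to the unsigned 64-bit value that an
 * API taking 'unsigned long long' expects. */
static unsigned long long
imm_from_rounded(double rounded)
{
   /* double -> long long is defined for in-range values, including
    * negative ones such as -1.0. */
   long long as_signed = (long long) rounded;

   /* signed -> unsigned is defined as reduction modulo 2^64, so -1
    * becomes all ones instead of an arbitrary value. */
   return (unsigned long long) as_signed;
}
```

Converting -1.0 directly to unsigned long long is the case the commit message calls out: the value is not representable in the target type, so the compiler is free to produce anything, including 0.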
9 years ago nv30: Implement color resolve for msaa
Hans de Goede [Thu, 3 Sep 2015 10:38:01 +0000 (12:38 +0200)]
nv30: Implement color resolve for msaa

Note this is not ideal. Since the sifm can only do source sizes up to
1024x1024 we end up using the blitter on nv4x, which is not that fast.

And on nv3x we end up using the cpu which is really slow.

Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years ago nv30: Fix creation of scanout buffers
Hans de Goede [Wed, 12 Aug 2015 11:39:42 +0000 (13:39 +0200)]
nv30: Fix creation of scanout buffers

Scanout buffers on nv30 must always be non-swizzled and have special
width alignment constraints.

These constraints have been taken from the xf86-video-nouveau
src/nv_accel_common.c: nouveau_allocate_surface() function.

nouveau_allocate_surface() applies these width constraints only when a
tiled attribute is set, which it sets for all surfaces allocated via
dri. This "tiling" is not the same as swizzling: scanout surfaces
must be linear / have a uniform_pitch or only complete garbage is shown.

This commit fixes dri3 on nv30 showing a garbled display. With dri3 the
scanout buffers are allocated by mesa, rather than by the ddx, and the
wrong stride of these buffers was causing the garbled display.

Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years ago vc4: Initialize pack field of qreg to 0 in qir_get_temp
Boyan Ding [Wed, 26 Aug 2015 11:52:50 +0000 (19:52 +0800)]
vc4: Initialize pack field of qreg to 0 in qir_get_temp

This avoids generation of undefined packing in qir and qpu instructions,
fixing a lot of rendering errors.

Fixes 8b36d107fdd (vc4: Pack the unorm-packing bits into a src MUL
instruction when possible.)

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
9 years ago i965: Disallow PixelTransfer operations for tiled-memcpy TexImage/ReadPixels
Chris Wilson [Fri, 4 Sep 2015 18:02:28 +0000 (19:02 +0100)]
i965: Disallow PixelTransfer operations for tiled-memcpy TexImage/ReadPixels

The tiled memcpy fast paths perform a simple blit (with only a couple of
trivial pixel conversion routines) and do not accommodate PixelTransfer
operations. Therefore, if any are set, fall back to the regular routines.
Note that PixelTransfer only applies to TexImage and ReadPixels, not to
GetTexImage.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
9 years ago i965/vec4: Don't unspill the same register in consecutive instructions
Iago Toral Quiroga [Fri, 4 Sep 2015 11:23:20 +0000 (13:23 +0200)]
i965/vec4: Don't unspill the same register in consecutive instructions

If we have spilled/unspilled a register in the current instruction, avoid
emitting unspills for the same register in the same instruction or consecutive
instructions following the current one as long as they keep reading the spilled
register. This should allow us to avoid emitting costly unspills that come with
little benefit to register allocation.

v2:
  - Apply the same logic when evaluating spilling costs (Curro).

v3:
  - Abstract the logic that decides if a register can be reused in a function
    that can be used from both spill_reg and evaluate_spill_costs (Curro).

v4:
  - Do not disallow reusing scratch_reg in predicated reads (Curro).
  - Track if previous sources in the same instruction read scratch_reg (Curro).
  - Return prev_inst_read_scratch_reg at the end (Curro).
  - No need to explicitly skip scratch read/write opcodes in spill_reg (Curro).
  - Fix the comments explaining what happens when we hit an instruction that
    does not read or write scratch_reg (Curro).
  - Return true early when the current or previous instructions read
    scratch_reg with a compatible mask.

v5:
  - Do not return true early, the loop should not be expensive anyway
    and this adds more complexity (Curro).

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
9 years ago i965: Add a debug option for spilling everything in vec4 code
Iago Toral Quiroga [Thu, 23 Jul 2015 09:11:53 +0000 (11:11 +0200)]
i965: Add a debug option for spilling everything in vec4 code

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
9 years ago dri/common: Tokenize driParseDebugString() argument before matching debug flags.
Francisco Jerez [Thu, 3 Sep 2015 12:20:04 +0000 (15:20 +0300)]
dri/common: Tokenize driParseDebugString() argument before matching debug flags.

Fixes debug string parsing when one of the supported flags is a
substring of another.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
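
To illustrate why tokenizing matters, here is a small sketch under stated assumptions. The `struct debug_flag` layout, the `example_flags` table, the flag names and the choice of comma and space as separators are all hypothetical; this is not the driParseDebugString() implementation. A substring search for "vec4" would also fire on the input "spill_vec4", whereas comparing whole tokens does not.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical flag table entry; the list ends with a NULL name. */
struct debug_flag {
   const char *name;
   uint64_t bit;
};

/* Hypothetical flags where one name is a substring of the other. */
static const struct debug_flag example_flags[] = {
   { "vec4",       1 },
   { "spill_vec4", 2 },
   { NULL,         0 },
};

/* Split 'debug' on commas and spaces and match each whole token. */
static uint64_t
parse_debug_tokens(const char *debug, const struct debug_flag *flags)
{
   uint64_t result = 0;
   const char *p = debug;

   while (*p) {
      size_t len = strcspn(p, ", ");   /* length of the next token */

      for (const struct debug_flag *f = flags; f->name; f++) {
         /* Require an exact match of the whole token, not a substring. */
         if (strlen(f->name) == len && strncmp(p, f->name, len) == 0)
            result |= f->bit;
      }

      p += len;
      p += strspn(p, ", ");            /* skip the separators */
   }

   return result;
}
```
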
9 years ago dri/common: Fix codestyle of driParseDebugString().
Francisco Jerez [Thu, 3 Sep 2015 11:50:12 +0000 (14:50 +0300)]
dri/common: Fix codestyle of driParseDebugString().

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
9 years ago glsl: error out on ES 3.1 if VS or FS present but not both
Tapani Pälli [Thu, 3 Sep 2015 11:26:48 +0000 (14:26 +0300)]
glsl: error out on ES 3.1 if VS or FS present but not both

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years ago glsl: error on linking if no shaders are attached to program
Tapani Pälli [Thu, 3 Sep 2015 11:20:46 +0000 (14:20 +0300)]
glsl: error on linking if no shaders are attached to program

This applies to OpenGL Core >= 4.5 and OpenGL ES >= 3.1.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years ago i965: Improve disassembly of data port read messages.
Kenneth Graunke [Thu, 13 Aug 2015 21:52:55 +0000 (14:52 -0700)]
i965: Improve disassembly of data port read messages.

We now print out the name of the message instead of its numerical
value, and label the message control and surface numbers.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
9 years ago i965: Optimize VUE map comparisons.
Kenneth Graunke [Fri, 15 May 2015 17:08:19 +0000 (10:08 -0700)]
i965: Optimize VUE map comparisons.

The entire VUE map is computed based on the slots_valid bitfield;
calling brw_compute_vue_map on the same bitfield will return the
same result.  So we can simply compare those.

struct brw_vue_map is 136 bytes; doing a single 8-byte comparison is
much cheaper and should work just as well.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
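
The reasoning above can be sketched as follows. The struct here is a deliberately reduced, hypothetical stand-in for the real VUE map, and `compute_vue_map()` / `vue_maps_equal()` are illustrative names. Because every derived field is a pure function of `slots_valid`, comparing that one 64-bit field is equivalent to comparing the whole structure.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for a VUE map. */
struct vue_map {
   uint64_t slots_valid;       /* input: one bit per enabled varying */
   int slot_for_varying[64];   /* derived entirely from slots_valid */
   int num_slots;              /* derived entirely from slots_valid */
};

/* Deterministically fill in the derived fields from the bitfield. */
static void
compute_vue_map(struct vue_map *map, uint64_t slots_valid)
{
   map->slots_valid = slots_valid;
   map->num_slots = 0;

   for (int i = 0; i < 64; i++) {
      if (slots_valid & (UINT64_C(1) << i))
         map->slot_for_varying[i] = map->num_slots++;
      else
         map->slot_for_varying[i] = -1;
   }
}

/* One 8-byte comparison instead of comparing every derived field. */
static bool
vue_maps_equal(const struct vue_map *a, const struct vue_map *b)
{
   return a->slots_valid == b->slots_valid;
}
```
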
9 years ago i965/gs: Don't reserve space for clip plane uniforms.
Kenneth Graunke [Sat, 29 Aug 2015 06:47:25 +0000 (23:47 -0700)]
i965/gs: Don't reserve space for clip plane uniforms.

These were only for legacy userclipping, which we no longer support
in geometry shaders.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
9 years ago i965: Don't do legacy userclipping in non-compatibility contexts.
Kenneth Graunke [Fri, 28 Aug 2015 08:43:23 +0000 (01:43 -0700)]
i965: Don't do legacy userclipping in non-compatibility contexts.

According to the GLSL 1.50 specification, page 76:
"The shader must also set all values in gl_ClipDistance that have been
 enabled via the OpenGL API, or results are undefined."

With this patch, we only enable clip distance writes when the shader
actually writes them.  We no longer force a value to be written when
clip planes are enabled in the API.  This could mean the first varying
slot would be used as clip distances - I believe it should be the safe
kind of undefined behavior.

Empirically, it doesn't seem to cause a problem.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
9 years ago i965: Remove the brw_vue_prog_key base class.
Kenneth Graunke [Fri, 28 Aug 2015 01:24:39 +0000 (18:24 -0700)]
i965: Remove the brw_vue_prog_key base class.

The legacy userclip fields are only used for the vertex shader, and at
that point there's only program_string_id and the tex struct, which are
common to all keys.  So there's no need for a "VUE" key base class.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
9 years ago i965: Virtualize vec4_visitor::emit_urb_slot().
Kenneth Graunke [Fri, 28 Aug 2015 07:29:05 +0000 (00:29 -0700)]
i965: Virtualize vec4_visitor::emit_urb_slot().

This avoids a downcast of key, which won't exist in the base class soon.

I'm not a huge fan of this patch, but given that we're currently using
inheritance, this seems like the "right" way to do it.  The alternative
is to make key a void pointer in the parent class and continue
downcasting.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
9 years ago i965: Store a key_tex pointer in vec4_visitor.
Kenneth Graunke [Fri, 28 Aug 2015 06:55:28 +0000 (23:55 -0700)]
i965: Store a key_tex pointer in vec4_visitor.

I'm about to remove the base class for VS/GS/HS/DS program keys, at
which point we won't be able to use key->tex anymore.  Instead, we'll
need to store a direct pointer (like we do in the FS backend).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
9 years ago i965: Move legacy clip plane handling to vec4_vs_visitor.
Kenneth Graunke [Fri, 28 Aug 2015 06:49:03 +0000 (23:49 -0700)]
i965: Move legacy clip plane handling to vec4_vs_visitor.

This is now only used for the vertex shader, so it makes sense to get it
out of any paths run by the geometry shader.

Instead of passing the gl_clip_plane array into the run() method (which
is shared among all subclasses), we add it as a vec4_vs_visitor
constructor parameter.  This eliminates the bogus NULL parameter in the
GS case.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
9 years ago i965: Delete the brw_vue_program_key::userclip_active flag.
Kenneth Graunke [Fri, 28 Aug 2015 00:02:27 +0000 (17:02 -0700)]
i965: Delete the brw_vue_program_key::userclip_active flag.

There are two uses of this flag.

The primary use is checking whether we need to emit code to convert
legacy gl_ClipVertex/gl_Position clipping to clip distances.  In this
case, we also have to upload the clip planes as uniforms, which means
setting nr_userclip_plane_consts to a positive value.  Checking if it's
> 0 works for detecting this case.

Gen4-5 also wants to know whether we're doing clipping at all, so it can
emit user clip flags.  Checking if output_reg[VARYING_SLOT_CLIP_DIST0]
is set to a real register suffices for this.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
9 years ago i965: Remove legacy clip plane handling from geometry shaders.
Kenneth Graunke [Thu, 27 Aug 2015 21:04:40 +0000 (14:04 -0700)]
i965: Remove legacy clip plane handling from geometry shaders.

We only support geometry shaders in core profiles, where gl_ClipVertex
doesn't exist.  Presumably the even older behavior of clipping to
gl_Position isn't supported either.  In fact, GLSL 1.50 page 76 claims:

"The shader must also set all values in gl_ClipDistance that have been
 enabled via the OpenGL API, or results are undefined."

So we don't need to handle legacy clipping in geometry shaders.  I think
Paul added this back when we were considering supporting the old
GL_ARB_geometry_shader4 extension.

This removes a non-orthogonal state dependency on GS compilation.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
9 years ago i965: Move brw_setup_tex_for_precompile to brw_program.[ch].
Kenneth Graunke [Fri, 28 Aug 2015 01:27:20 +0000 (18:27 -0700)]
i965: Move brw_setup_tex_for_precompile to brw_program.[ch].

This living in brw_fs.{h,cpp} is a historical artifact of us supporting
texturing for fragment shaders before any other stages.  It's kind of
awkward given that we use it for all stages.

This avoids having to include brw_fs.h in geometry shader code in order
to access this function.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
9 years ago mesa: change 'SHADER_SUBST' facility to work with env variables
Tapani Pälli [Mon, 31 Aug 2015 06:54:23 +0000 (09:54 +0300)]
mesa: change 'SHADER_SUBST' facility to work with env variables

Patch modifies the existing shader source dump and replace functionality to work
with environment variables rather than being enabled at compile time.
Also instead of _mesa_str_checksum, _mesa_sha1_compute is used to avoid
collisions.

Functionality is controlled via two environment variables:

MESA_SHADER_DUMP_PATH - path where shader sources are dumped
MESA_SHADER_READ_PATH - path where replacement shaders are read

v2: cleanups, add strerror if fopen fails, put all functionality
    inside HAVE_SHA1 since sha1 is required

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Suggested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
9 years ago build: add HAVE_SHA1 define when using --with-sha1 option
Tapani Pälli [Thu, 3 Sep 2015 05:34:42 +0000 (08:34 +0300)]
build: add HAVE_SHA1 define when using --with-sha1 option

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Brian Paul <brianp@vmware.com>
9 years ago i965: Fix copy propagation type changes.
Kenneth Graunke [Wed, 2 Sep 2015 23:39:27 +0000 (16:39 -0700)]
i965: Fix copy propagation type changes.

commit 472ef9a02f2e5c5d0caa2809cb736a0f4f0d4693 introduced code to
change the types of SEL and MOV instructions for moves that simply
"copy bits around".  It didn't account for type conversion moves,
however.  So it would happily turn this:

   mov(8) vgrf6:D, -vgrf5:D
   mov(8) vgrf7:F, vgrf6:UD

into this:

   mov(8) vgrf6:D, -vgrf5:D
   mov(8) vgrf7:D, -vgrf5:D

which erroneously drops the conversion to float.

Cc: "11.0 10.6" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
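
As a rough analogy in C (not the i965 compiler code), the difference between a move that only copies bits and a move that converts between types looks like this. The helper names are hypothetical. Retyping is harmless for the first kind and changes the result for the second, which is why the conversion to float must not be dropped.

```c
#include <stdint.h>
#include <string.h>

/* A move that "copies bits around": the bit pattern is unchanged, so the
 * type label attached to it does not affect the stored value. */
static uint32_t
raw_move(int32_t src)
{
   uint32_t dst;
   memcpy(&dst, &src, sizeof(dst));
   return dst;
}

/* A type-converting move, comparable to an integer source with a float
 * destination: the stored bit pattern changes. */
static float
converting_move(int32_t src)
{
   return (float) src;
}

/* Helper to inspect the bit pattern of a float. */
static uint32_t
float_bits(float f)
{
   uint32_t bits;
   memcpy(&bits, &f, sizeof(bits));
   return bits;
}
```
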
9 years ago r600: fix loop overrun in cayman_mul_double_instr
Dave Airlie [Thu, 3 Sep 2015 22:02:14 +0000 (08:02 +1000)]
r600: fix loop overrun in cayman_mul_double_instr

Coverity warned about this. Ilia pointed it out.

Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years ago i965/gen9: Annotate input coverage mask change
Ben Widawsky [Wed, 26 Aug 2015 23:35:40 +0000 (16:35 -0700)]
i965/gen9: Annotate input coverage mask change

As far as I can tell, the behavior is preserved from the previous generations.
Previously, we set a single bit to tell the FS whether or not we'll be using an input
coverage mask. Now we have some options which are implementing various
extensions. These bits are used for the various conservative rasterization
mechanisms (for collision detection, binning, and whatever else).

I believe that the behavior is preserved because the problem which conservative
rasterization is attempting to fix would go away with the "NORMAL" mode (at the
cost of performance, I believe).

This patch serves as documentation of the change by creating the enums, as well
as giving some of the history with the links here so that the next person who
comes along and looks at it doesn't spend as long as I had to in order to
determine if there is an issue or not.

Previously, this algorithm had been done in software, and this can still be used
as long as we don't export an extension stating otherwise.

References: https://www.opengl.org/registry/specs/NV/conservative_raster.txt
References: https://http.developer.nvidia.com/GPUGems2/gpugems2_chapter42.html
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago svga: update call to u_upload_alloc()
Brian Paul [Thu, 3 Sep 2015 17:23:36 +0000 (11:23 -0600)]
svga: update call to u_upload_alloc()

u_upload_alloc() no longer returns a value.

Trivial.

9 years ago winsys/radeon: remove exported buffers from the cache
Marek Olšák [Tue, 1 Sep 2015 02:14:43 +0000 (04:14 +0200)]
winsys/radeon: remove exported buffers from the cache

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
9 years ago winsys/amdgpu: remove exported buffers from the cache
Marek Olšák [Tue, 1 Sep 2015 02:14:33 +0000 (04:14 +0200)]
winsys/amdgpu: remove exported buffers from the cache

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
9 years ago gallium/pb_bufmgr_cache: add a way to remove buffers from the cache explicitly
Marek Olšák [Tue, 1 Sep 2015 02:07:54 +0000 (04:07 +0200)]
gallium/pb_bufmgr_cache: add a way to remove buffers from the cache explicitly

This must be done before exporting a buffer as dmabuf fds, because
we lose track of who is using it and can't trust the reference counter.

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
9 years ago u_upload_mgr: remove the return value from u_upload_data
Marek Olšák [Wed, 2 Sep 2015 13:11:40 +0000 (15:11 +0200)]
u_upload_mgr: remove the return value from u_upload_data

Reviewed-by: Brian Paul <brianp@vmware.com>
9 years ago u_upload_mgr: remove the return value from u_upload_buffer
Marek Olšák [Wed, 2 Sep 2015 13:11:40 +0000 (15:11 +0200)]
u_upload_mgr: remove the return value from u_upload_buffer

Reviewed-by: Brian Paul <brianp@vmware.com>
9 years ago u_upload_mgr: remove the return value from u_upload_alloc_buffer
Marek Olšák [Wed, 2 Sep 2015 13:11:40 +0000 (15:11 +0200)]
u_upload_mgr: remove the return value from u_upload_alloc_buffer

Reviewed-by: Brian Paul <brianp@vmware.com>
9 years ago u_upload_mgr: remove the return value from u_upload_alloc
Marek Olšák [Wed, 2 Sep 2015 13:08:23 +0000 (15:08 +0200)]
u_upload_mgr: remove the return value from u_upload_alloc

The returned buffer or the returned pointer can be used instead.

Reviewed-by: Brian Paul <brianp@vmware.com>
9 years ago u_upload_mgr: optimize u_upload_alloc
Marek Olšák [Wed, 2 Sep 2015 12:57:55 +0000 (14:57 +0200)]
u_upload_mgr: optimize u_upload_alloc

This is probably the most called util function. It does almost nothing,
yet it can consume 10% of the CPU in a profile. This drops it down to 5%.

Reviewed-by: Brian Paul <brianp@vmware.com>
9 years ago gallium/radeon: remove 'dirty' member from r600_atom
Grazvydas Ignotas [Wed, 2 Sep 2015 22:54:32 +0000 (01:54 +0300)]
gallium/radeon: remove 'dirty' member from r600_atom

It's no longer used by either r600 or radeonsi.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
9 years ago r600g: simplify dirty atom tracking
Grazvydas Ignotas [Wed, 2 Sep 2015 22:54:31 +0000 (01:54 +0300)]
r600g: simplify dirty atom tracking

Now that R600_NUM_ATOMS is below 64, dirty atom tracking can be
simplified.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
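
With fewer than 64 atoms, the dirty state fits in a single 64-bit mask. The following is a sketch of that idea only: `struct atom`, `struct context`, `mark_atom_dirty()` and `next_dirty_atom()` are hypothetical names, not the r600g data structures or functions.

```c
#include <stdint.h>

/* Hypothetical atom: just an id in the range 0..63. */
struct atom {
   unsigned id;
};

/* Hypothetical context: one bit of dirty state per atom id. */
struct context {
   uint64_t dirty_atoms;
};

/* Marking an atom dirty is a single OR. */
static void
mark_atom_dirty(struct context *ctx, const struct atom *atom)
{
   ctx->dirty_atoms |= UINT64_C(1) << atom->id;
}

/* Return the lowest dirty atom id and clear its bit, or -1 if none. */
static int
next_dirty_atom(struct context *ctx)
{
   for (unsigned id = 0; id < 64; id++) {
      uint64_t bit = UINT64_C(1) << id;

      if (ctx->dirty_atoms & bit) {
         ctx->dirty_atoms &= ~bit;
         return (int) id;
      }
   }

   return -1;
}
```

A production version would more likely find the lowest set bit with a count-trailing-zeros primitive than with a loop; the loop is used here only to keep the sketch portable.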
9 years ago r600g: start numbering atoms from 1
Grazvydas Ignotas [Wed, 2 Sep 2015 22:54:30 +0000 (01:54 +0300)]
r600g: start numbering atoms from 1

There doesn't seem to be any reason to start from 4.
Start from 1 instead (0 is left reserved to catch uninitialized atoms).

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
9 years ago r600g: make all viewport states use single atom
Grazvydas Ignotas [Wed, 2 Sep 2015 22:54:29 +0000 (01:54 +0300)]
r600g: make all viewport states use single atom

Similarly to scissor states, we can use a single atom to track all viewport
states. This will allow us to simplify dirty atom handling later.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
9 years ago r600g: apply disable workaround on all scissors
Grazvydas Ignotas [Wed, 2 Sep 2015 22:54:28 +0000 (01:54 +0300)]
r600g: apply disable workaround on all scissors

During review of the "r600g: make all scissor states use single atom" patch,
Marek Olšák noticed that the scissor disable workaround should be applied to
all scissor states and not just the first one, so let's do so.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
9 years ago r600g: make all scissor states use single atom
Grazvydas Ignotas [Wed, 2 Sep 2015 22:54:27 +0000 (01:54 +0300)]
r600g: make all scissor states use single atom

As suggested by Marek Olšák, we can use a single atom to track all scissor
states. This will allow us to simplify dirty atom handling later.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
9 years ago mesa/pbo: Handle zero width, height or depth when validating access
Neil Roberts [Wed, 2 Sep 2015 10:29:16 +0000 (11:29 +0100)]
mesa/pbo: Handle zero width, height or depth when validating access

It's legal to call glTexSubImage with zero values for the width,
height or depth. Previously this was breaking the PBO access
validation because it tries to work out the last pixel accessed by
getting the pixel at height-1 and depth-1 which would end up with
bogus values.

This was causing GL errors to be generated during the Piglit
texsubimage test, although the test was passing anyway.

v2: Also check for width == 0. Don't validate the start pointer if any
    of the dimensions are zero.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
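
A simplified sketch of the validation described above, assuming tightly packed rows and ignoring pixel-store parameters and integer overflow. `pbo_access_ok()` and its parameters are hypothetical, not the Mesa validation function. The early return is the key step: with a zero dimension no pixel is touched, so neither the start nor the end offset is checked, and the height-1 / depth-1 arithmetic is never reached.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true when the access stays within a buffer of 'buffer_size'
 * bytes, starting at byte 'offset'. */
static bool
pbo_access_ok(size_t buffer_size, size_t offset,
              int width, int height, int depth, size_t bytes_per_pixel)
{
   /* An empty region accesses nothing, so it is always valid. */
   if (width == 0 || height == 0 || depth == 0)
      return true;

   size_t row = (size_t) width * bytes_per_pixel;
   size_t image = row * (size_t) height;

   /* One past the last byte: start of the last image, plus the start of
    * its last row, plus one full row. */
   size_t end = offset + image * (size_t) (depth - 1)
                       + row * (size_t) (height - 1)
                       + row;

   return offset <= buffer_size && end <= buffer_size;
}
```
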
9 years ago glsl: Remove unused total_attribs_size variable.
Kenneth Graunke [Thu, 3 Sep 2015 07:55:40 +0000 (00:55 -0700)]
glsl: Remove unused total_attribs_size variable.

Accidentally left behind by my previous patch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
9 years ago glsl: Handle attribute aliasing in attribute storage limit check.
Kenneth Graunke [Wed, 2 Sep 2015 17:42:57 +0000 (10:42 -0700)]
glsl: Handle attribute aliasing in attribute storage limit check.

In various versions of OpenGL and GLSL, it's possible to declare
multiple VS input variables with aliasing attribute locations.

So, when computing the storage requirements for vertex attributes,
we can't simply add up the sizes.  Instead, we need to look at the
enabled slots.

This patch begins tracking which attributes are double types that
are larger than 128-bits (i.e. take up two vec4 slots).  We then
count normal attributes once, and count the double-size attributes
a second time.

Fixes deQP functional.attribute_location.bind_aliasing.max_cond_* tests
on i965, which regressed with commit ad208d975a6d3aebe14f7c2c16039ee20.

No Piglit changes on llvmpipe (which actually supports dvecs).

Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
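
The counting scheme above can be sketched with two bitmasks. The names `attribute_slots_used()`, `attribs_used` and `double_mask` are hypothetical and this is not the linker code. Aliased attributes share a location bit, so they are counted once; locations holding double-size types are counted a second time.

```c
#include <stdint.h>

/* Count the set bits in a 64-bit mask. */
static unsigned
count_bits(uint64_t v)
{
   unsigned n = 0;

   while (v) {
      v &= v - 1;   /* clear the lowest set bit */
      n++;
   }

   return n;
}

/* attribs_used: one bit per enabled attribute location.
 * double_mask:  locations whose type needs two vec4 slots. */
static unsigned
attribute_slots_used(uint64_t attribs_used, uint64_t double_mask)
{
   return count_bits(attribs_used) + count_bits(attribs_used & double_mask);
}
```

Summing the declared sizes instead would count two variables aliased to the same location twice, which is the over-counting the commit message describes.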
9 years ago i965/meta: Fix typo in comment
Ian Romanick [Wed, 2 Sep 2015 00:42:31 +0000 (17:42 -0700)]
i965/meta: Fix typo in comment

Trivial.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
9 years ago mesa: Don't allow wrong type setters for matrix uniforms
Ian Romanick [Tue, 1 Sep 2015 01:44:42 +0000 (18:44 -0700)]
mesa: Don't allow wrong type setters for matrix uniforms

Previously we would allow glUniformMatrix4fv on a dmat4 and
glUniformMatrix4dv on a mat4.  Both are illegal.  The latter also
overwrites the storage for the mat4 and causes bad things to happen.

Should fix the (new) arb_gpu_shader_fp64-wrong-type-setter piglit test.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
Cc: Dave Airlie <airlied@redhat.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
9 years ago mesa: Pass the type to _mesa_uniform_matrix as a glsl_base_type
Ian Romanick [Tue, 1 Sep 2015 01:30:48 +0000 (18:30 -0700)]
mesa: Pass the type to _mesa_uniform_matrix as a glsl_base_type

This matches _mesa_uniform, and it enables the bug fix in the next
patch.

v2: s/type/basicType/ in the assert in _mesa_uniform_matrix.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au> [v1]
Cc: Dave Airlie <airlied@redhat.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
9 years ago mesa: Silence unused parameter warnings in bufferobj.c
Ian Romanick [Wed, 26 Aug 2015 12:50:04 +0000 (13:50 +0100)]
mesa: Silence unused parameter warnings in bufferobj.c

main/bufferobj.c: In function 'count_buffer_size':
main/bufferobj.c:520:26: warning: unused parameter 'key' [-Wunused-parameter]
 count_buffer_size(GLuint key, void *data, void *userData)
                          ^
main/bufferobj.c: In function 'flush_mapped_buffer_range_fallback':
main/bufferobj.c:740:56: warning: unused parameter 'index' [-Wunused-parameter]
                                    gl_map_buffer_index index)
                                                        ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years ago mesa: Remove target parameter from _mesa_handle_bind_buffer_gen
Ian Romanick [Wed, 26 Aug 2015 12:55:54 +0000 (13:55 +0100)]
mesa: Remove target parameter from _mesa_handle_bind_buffer_gen

main/bufferobj.c: In function '_mesa_handle_bind_buffer_gen':
main/bufferobj.c:915:37: warning: unused parameter 'target' [-Wunused-parameter]
                              GLenum target,
                                     ^

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years ago i965: Make gen7_enable_hw_binding_tables static
Ian Romanick [Wed, 19 Aug 2015 21:25:48 +0000 (14:25 -0700)]
i965: Make gen7_enable_hw_binding_tables static

All of the other state upload functions are static because the only use
is in the brw_tracked_state structure.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Abdiel Janulgue <abdiel.janulgue@linux.intel.com>
9 years ago i965: Make gen8_upload_state_base_address static
Ian Romanick [Wed, 19 Aug 2015 20:54:21 +0000 (13:54 -0700)]
i965: Make gen8_upload_state_base_address static

All of the other state upload functions are static because the only use
is in the brw_tracked_state structure.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
9 years ago linker: Silence GCC unused parameter warnings
Ian Romanick [Wed, 19 Aug 2015 20:36:22 +0000 (13:36 -0700)]
linker: Silence GCC unused parameter warnings

linker.cpp:320:55: warning: unused parameter 'ir' [-Wunused-parameter]
    virtual ir_visitor_status visit_leave(ir_function *ir)
                                                       ^
linker.cpp:327:53: warning: unused parameter 'ir' [-Wunused-parameter]
    virtual ir_visitor_status visit_leave(ir_return *ir)
                                                     ^
linker.cpp:333:49: warning: unused parameter 'ir' [-Wunused-parameter]
    virtual ir_visitor_status visit_enter(ir_if *ir)
                                                 ^
linker.cpp:339:49: warning: unused parameter 'ir' [-Wunused-parameter]
    virtual ir_visitor_status visit_leave(ir_if *ir)
                                                 ^
linker.cpp:345:51: warning: unused parameter 'ir' [-Wunused-parameter]
    virtual ir_visitor_status visit_enter(ir_loop *ir)
                                                   ^
linker.cpp:351:51: warning: unused parameter 'ir' [-Wunused-parameter]
    virtual ir_visitor_status visit_leave(ir_loop *ir)
                                                   ^
linker.cpp:2824:53: warning: unused parameter 'ctx' [-Wunused-parameter]
 link_calculate_subroutine_compat(struct gl_context *ctx, struct gl_shader_program *prog)
                                                     ^
linker.cpp:2854:47: warning: unused parameter 'ctx' [-Wunused-parameter]
 check_subroutine_resources(struct gl_context *ctx, struct gl_shader_program *prog)
                                               ^
linker.cpp:3368:49: warning: unused parameter 'ctx' [-Wunused-parameter]
 link_assign_subroutine_types(struct gl_context *ctx,
                                                 ^

Also make link_assign_subroutine_types static since it is only called
from this file.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years ago  mesa: Fix warning about static being in the wrong place
Ian Romanick [Wed, 19 Aug 2015 00:41:30 +0000 (17:41 -0700)]
mesa: Fix warning about static being in the wrong place

Because the compiler already has enough things to complain about.

    grep -rl 'const static' src/ | while read f
    do
        sed --in-place -e 's/const static/static const/g' $f
    done

brw_eu_emit.c: In function 'brw_reg_type_to_hw_type':
brw_eu_emit.c:98:7: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
       const static int imm_hw_types[] = {
       ^
brw_eu_emit.c:120:7: warning: 'static' is not at beginning of declaration [-Wold-style-declaration]
       const static int hw_types[] = {
       ^
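
For illustration only, a minimal sketch of the declaration order GCC
expects. This is not code from brw_eu_emit.c: the function name and the
table contents are made up, and only the position of 'static' matters.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lookup; the values in the table are placeholders. */
int sketch_lookup_hw_type(size_t idx)
{
   /* 'static' comes first, so -Wold-style-declaration stays quiet. */
   static const int hw_types[] = { 10, 20, 30 };

   assert(idx < sizeof(hw_types) / sizeof(hw_types[0]));
   return hw_types[idx];
}
```

Writing 'const static int hw_types[]' instead declares the same object
but triggers the warning quoted above.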

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years ago  i965/cs: Setup push constant data for uniforms
Jordan Justen [Tue, 23 Sep 2014 23:46:39 +0000 (16:46 -0700)]
i965/cs: Setup push constant data for uniforms

brw_upload_cs_push_constants was based on gen6_upload_push_constants.

v2:
 * Add FINISHME comments about more efficient ways to push uniforms

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
9 years ago  meta: Save/restore compute shaders
Jordan Justen [Mon, 25 May 2015 19:23:05 +0000 (12:23 -0700)]
meta: Save/restore compute shaders

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
9 years ago  svga: fix referencing a NULL framebuffer cbuf
Charmaine Lee [Fri, 21 Aug 2015 18:41:26 +0000 (11:41 -0700)]
svga: fix referencing a NULL framebuffer cbuf

Check for a valid framebuffer cbuf pointer before accessing its
associated surface.

Fix piglit test fbo-drawbuffers-none.

Reviewed-by: Brian Paul <brianp@vmware.com>
9 years ago  svga: increment texture age when surface is to be marked as dirty
Charmaine Lee [Fri, 21 Aug 2015 17:36:24 +0000 (10:36 -0700)]
svga: increment texture age when surface is to be marked as dirty

Commit b9ba8492 removes an unneeded pipe_surface_release() from
st_render_texture(). This implies a surface can now be reused for a
render buffer. Currently, when we render to a texture, we mark the
surface as dirty. But in svga_mark_surface_dirty(), if the surface
is already marked as dirty, it does not increment the texture age.
Any view to this texture might not be updated properly then.

With this patch, the texture age is incremented regardless of whether
the surface is already marked as dirty or not.
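
The behaviour described above can be sketched roughly as follows. The
struct and function names are hypothetical stand-ins, not the svga
driver's real types; the point is only that the age bump no longer
depends on the dirty flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's texture and surface objects. */
struct sketch_texture { unsigned age; };
struct sketch_surface { struct sketch_texture *texture; bool dirty; };

void sketch_mark_surface_dirty(struct sketch_surface *s)
{
   assert(s != NULL && s->texture != NULL);

   if (!s->dirty)
      s->dirty = true;   /* first write since the surface was last clean */

   /* Bump the age unconditionally so views of the texture get refreshed. */
   s->texture->age++;
}
```

Calling the function twice on the same surface raises the age twice,
even though the dirty flag only changes on the first call.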

Fix bug 1499181.

Reviewed-by: Sinclair Yeh <syeh@vmware.com>
9 years ago  svga: fix backed surface view regression
Charmaine Lee [Thu, 13 Aug 2015 22:08:22 +0000 (15:08 -0700)]
svga: fix backed surface view regression

Commit b9ba8492 removes an unneeded pipe_surface_release() from
st_render_texture() and exposes a bug in the backed surface view
creation.  Currently a backed surface view for a conflicted surface view
is created at framebuffer emit time. But if shader sampler views are changed
but framebuffer surface views remain unchanged, emit_framebuffer() will not
be called and conflicted surface views will not be detected.

To fix this, also check for conflicted surface views when setting sampler
views. If there are any conflicted surface views, enable the
framebuffer dirty bit so that the framebuffer emit code has a chance to
create a backed surface view for the conflicted surface view.
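
A rough sketch of the idea, under heavy simplification: textures are
plain integer ids, and the state struct, its fields, and the function
name are all hypothetical rather than taken from the svga driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_CBUFS 8

/* Hypothetical driver state: textures are identified by integer ids. */
struct sketch_state {
   int cbuf_textures[SKETCH_MAX_CBUFS];  /* textures bound as render targets */
   size_t num_cbufs;
   bool framebuffer_dirty;
};

/* When sampler views change, flag the framebuffer as dirty if any sampled
 * texture is also bound as a color buffer, so that the framebuffer emit
 * code gets a chance to run. */
void sketch_set_sampler_views(struct sketch_state *st,
                              const int *view_textures, size_t count)
{
   assert(st != NULL && st->num_cbufs <= SKETCH_MAX_CBUFS);
   assert(view_textures != NULL || count == 0);

   for (size_t i = 0; i < count; i++) {
      for (size_t j = 0; j < st->num_cbufs; j++) {
         if (view_textures[i] == st->cbuf_textures[j]) {
            st->framebuffer_dirty = true;
            return;
         }
      }
   }
}
```

The sketch only raises the flag; creating the backed surface view is
left to the framebuffer emit path, as in the description above.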

Fix cinebench-r11-test regression.

Reviewed-by: Brian Paul <brianp@vmware.com>
9 years ago  i965/fs: Handle MRF destinations in lower_integer_multiplication().
Matt Turner [Wed, 2 Sep 2015 05:00:24 +0000 (22:00 -0700)]
i965/fs: Handle MRF destinations in lower_integer_multiplication().

The lowered code reads from the destination, which isn't possible from
message registers.

Fixes the following dEQP tests on SNB:

    dEQP-GLES3.functional.shaders.precision.int.highp_mul_fragment
    dEQP-GLES3.functional.shaders.precision.int.mediump_mul_fragment
    dEQP-GLES3.functional.shaders.precision.int.lowp_mul_fragment

Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years ago  docs: document VMware OpenGL 3.3 support
Brian Paul [Thu, 13 Aug 2015 20:50:13 +0000 (13:50 -0700)]
docs: document VMware OpenGL 3.3 support

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years ago  svga: update driver for version 10 GPU interface
Brian Paul [Thu, 13 Aug 2015 18:00:58 +0000 (11:00 -0700)]
svga: update driver for version 10 GPU interface

This is a squash commit of roughly two years of development work.
Authors include:
  Brian Paul
  Charmaine Lee
  Thomas Hellstrom
  Jakob Bornecrantz
  Sinclair Yeh
  Mingcheng Chen
  Kai Ninomiya
  MengLin Wu

The driver supports OpenGL 3.3.

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years ago  svga: add new version 10 device command prototypes
Brian Paul [Fri, 7 Aug 2015 21:41:17 +0000 (15:41 -0600)]
svga: add new version 10 device command prototypes

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years ago  svga: add new svga_streamout.h file
Brian Paul [Fri, 7 Aug 2015 21:23:51 +0000 (15:23 -0600)]
svga: add new svga_streamout.h file

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years ago  svga: add new svga_state_tgsi_transform.c file
Brian Paul [Fri, 7 Aug 2015 22:04:03 +0000 (16:04 -0600)]
svga: add new svga_state_tgsi_transform.c file

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years ago  svga: add new svga_state_sampler.c file
Brian Paul [Fri, 7 Aug 2015 21:22:18 +0000 (15:22 -0600)]
svga: add new svga_state_sampler.c file

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years ago  svga: add new svga_state_gs.c file
Brian Paul [Fri, 7 Aug 2015 21:22:01 +0000 (15:22 -0600)]
svga: add new svga_state_gs.c file

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years ago  svga: add new svga_pipe_streamout.c file
Brian Paul [Fri, 7 Aug 2015 21:21:46 +0000 (15:21 -0600)]
svga: add new svga_pipe_streamout.c file

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years ago  svga: add new svga_pipe_gs.c file
Brian Paul [Fri, 7 Aug 2015 21:21:29 +0000 (15:21 -0600)]
svga: add new svga_pipe_gs.c file

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years ago  svga: add new svga_link.[ch] files
Brian Paul [Fri, 7 Aug 2015 21:21:10 +0000 (15:21 -0600)]
svga: add new svga_link.[ch] files

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years ago  svga: add new svga_cmd_vgpu10.c file
Brian Paul [Fri, 7 Aug 2015 20:57:22 +0000 (14:57 -0600)]
svga: add new svga_cmd_vgpu10.c file

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years ago  svga: add new svga_tgsi_vgpu10.c file
Brian Paul [Fri, 7 Aug 2015 20:56:51 +0000 (14:56 -0600)]
svga: add new svga_tgsi_vgpu10.c file

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years ago  svga: remove unused SVGA3D_* command functions
Brian Paul [Fri, 7 Aug 2015 22:11:14 +0000 (16:11 -0600)]
svga: remove unused SVGA3D_* command functions

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years ago  gallium/st: add pipe_context::get_timestamp()
Brian Paul [Fri, 7 Aug 2015 20:54:24 +0000 (14:54 -0600)]
gallium/st: add pipe_context::get_timestamp()

The VMware svga driver doesn't directly support pipe_screen::get_timestamp()
but we can do a work-around.  However, we need a gallium context to do so.
This patch adds a new pipe_context::get_timestamp() function that will only
be called if the pipe_screen::get_timestamp() function is NULL.
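
The fallback order described above can be sketched like this. The
structs are cut-down, hypothetical stand-ins for pipe_screen and
pipe_context with a single hook each, and the two fixed-value hooks
exist only to make the example testable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, cut-down stand-ins: one optional hook per object. */
struct sketch_screen  { uint64_t (*get_timestamp)(struct sketch_screen *); };
struct sketch_context { uint64_t (*get_timestamp)(struct sketch_context *); };

/* Prefer the screen hook; use the context hook only when the screen's is NULL. */
uint64_t sketch_query_timestamp(struct sketch_screen *screen,
                                struct sketch_context *ctx)
{
   assert(screen != NULL && ctx != NULL);

   if (screen->get_timestamp != NULL)
      return screen->get_timestamp(screen);

   assert(ctx->get_timestamp != NULL);
   return ctx->get_timestamp(ctx);
}

/* Example hooks returning fixed values, for demonstration only. */
uint64_t sketch_fixed_screen_time(struct sketch_screen *screen)
{
   (void)screen;
   return 7;
}

uint64_t sketch_fixed_context_time(struct sketch_context *ctx)
{
   (void)ctx;
   return 42;
}
```

A driver that can only produce timestamps through a context would leave
the screen hook NULL and provide the context hook instead.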

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years ago  svga/winsys: Add support for VGPU10
Brian Paul [Thu, 6 Aug 2015 22:44:35 +0000 (16:44 -0600)]
svga/winsys: Add support for VGPU10

This involves a few driver modifications to keep things building.
The driver may not actually run properly at this point.

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years ago  svga: update the svga3d device header files
Brian Paul [Thu, 6 Aug 2015 22:28:19 +0000 (16:28 -0600)]
svga: update the svga3d device header files

Remove some obsolete svga_dump.c code for items which no longer exist.

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years ago  svga: add new version 10 device header files
Brian Paul [Fri, 7 Aug 2015 20:56:03 +0000 (14:56 -0600)]
svga: add new version 10 device header files

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years ago  winsys/svga: add new vmw_query.c[h] files
Brian Paul [Wed, 29 Jul 2015 17:23:29 +0000 (11:23 -0600)]
winsys/svga: add new vmw_query.c[h] files

Functions for creating, destroying, getting queries, etc.

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years ago  meta: Compute correct buffer size with SkipRows/SkipPixels
Chris Wilson [Tue, 1 Sep 2015 08:31:15 +0000 (09:31 +0100)]
meta: Compute correct buffer size with SkipRows/SkipPixels

If the user is specifying a subregion of a buffer using SKIP_ROWS and
SKIP_PIXELS, we must compute the buffer size carefully as the end of the
last row may be much shorter than stride*image_height*depth. The current
code tries to memcpy from beyond the end of the user data, for example
causing:

==28136== Invalid read of size 8
==28136==    at 0x4C2D94E: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:915)
==28136==    by 0xB4ADFE3: brw_bo_write (brw_batch.c:1856)
==28136==    by 0xB5B3531: brw_buffer_data (intel_buffer_objects.c:208)
==28136==    by 0xB0F6275: _mesa_buffer_data (bufferobj.c:1600)
==28136==    by 0xB0F6346: _mesa_BufferData (bufferobj.c:1631)
==28136==    by 0xB37A1EE: create_texture_for_pbo (meta_tex_subimage.c:103)
==28136==    by 0xB37A467: _mesa_meta_pbo_TexSubImage (meta_tex_subimage.c:176)
==28136==    by 0xB5C8D61: intelTexSubImage (intel_tex_subimage.c:195)
==28136==    by 0xB254AB4: _mesa_texture_sub_image (teximage.c:3654)
==28136==    by 0xB254C9F: texsubimage (teximage.c:3712)
==28136==    by 0xB2550E9: _mesa_TexSubImage2D (teximage.c:3853)
==28136==    by 0x401CA0: UploadTexSubImage2D (teximage.c:171)
==28136==  Address 0xd8bfbe0 is 0 bytes after a block of size 1,024 alloc'd
==28136==    at 0x4C28C20: malloc (vg_replace_malloc.c:296)
==28136==    by 0x402014: PerfDraw (teximage.c:270)
==28136==    by 0x402648: Draw (glmain.c:182)
==28136==    by 0x8385E63: ??? (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0)
==28136==    by 0x83896C8: fgEnumWindows (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0)
==28136==    by 0x838641C: glutMainLoopEvent (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0)
==28136==    by 0x8386C1C: glutMainLoop (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0)
==28136==    by 0x4019C1: main (glmain.c:262)
==28136==
==28136== Invalid read of size 8
==28136==    at 0x4C2D940: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:915)
==28136==    by 0xB4ADFE3: brw_bo_write (brw_batch.c:1856)
==28136==    by 0xB5B3531: brw_buffer_data (intel_buffer_objects.c:208)
==28136==    by 0xB0F6275: _mesa_buffer_data (bufferobj.c:1600)
==28136==    by 0xB0F6346: _mesa_BufferData (bufferobj.c:1631)
==28136==    by 0xB37A1EE: create_texture_for_pbo (meta_tex_subimage.c:103)
==28136==    by 0xB37A467: _mesa_meta_pbo_TexSubImage (meta_tex_subimage.c:176)
==28136==    by 0xB5C8D61: intelTexSubImage (intel_tex_subimage.c:195)
==28136==    by 0xB254AB4: _mesa_texture_sub_image (teximage.c:3654)
==28136==    by 0xB254C9F: texsubimage (teximage.c:3712)
==28136==    by 0xB2550E9: _mesa_TexSubImage2D (teximage.c:3853)
==28136==    by 0x401CA0: UploadTexSubImage2D (teximage.c:171)
==28136==  Address 0xd8bfbe8 is 8 bytes after a block of size 1,024 alloc'd
==28136==    at 0x4C28C20: malloc (vg_replace_malloc.c:296)
==28136==    by 0x402014: PerfDraw (teximage.c:270)
==28136==    by 0x402648: Draw (glmain.c:182)
==28136==    by 0x8385E63: ??? (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0)
==28136==    by 0x83896C8: fgEnumWindows (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0)
==28136==    by 0x838641C: glutMainLoopEvent (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0)
==28136==    by 0x8386C1C: glutMainLoop (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0)
==28136==    by 0x4019C1: main (glmain.c:262)
==28136==
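
To make the size argument concrete, here is a minimal 2D-only sketch. It
is not the code from this patch: the helper names are hypothetical,
depth is ignored, and the stride is assumed to be at least one row of
selected pixels wide.

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of the first selected pixel inside the user's buffer. */
size_t sketch_subimage_offset(size_t skip_rows, size_t skip_pixels,
                              size_t stride, size_t bytes_per_pixel)
{
   return skip_rows * stride + skip_pixels * bytes_per_pixel;
}

/* Bytes that may be read starting at that first selected pixel. */
size_t sketch_subimage_bytes(size_t width, size_t height,
                             size_t stride, size_t bytes_per_pixel)
{
   assert(width > 0 && height > 0);
   assert(width * bytes_per_pixel <= stride);

   /* Every row but the last spans a full stride; the last row ends
    * right after its 'width' selected pixels. */
   return (height - 1) * stride + width * bytes_per_pixel;
}
```

For example, with a 64-byte stride, 4 bytes per pixel, 4 skipped pixels
and a 12x16 region, the offset is 16 and the size is 1008 bytes, which
ends exactly at byte 1024. Using stride * height from the same offset
would claim 1040 bytes, 16 past the end of a 1024-byte allocation.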

Fixes regression from commit 7f396189f073d626c5f7a2c232dac92b65f5a23f
Author: Jason Ekstrand <jason.ekstrand@intel.com>
Date:   Mon Jan 5 18:17:04 2015 -0800

    meta: Add a BlitFramebuffers-based implementation of TexSubImage

v2: However, the teximage we create does need to be width x full_height x 1

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Neil Roberts <neil@linux.intel.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>

9 years ago  i965/vec4: fill src_reg type using the constructor type parameter
Alejandro Piñeiro [Tue, 1 Sep 2015 15:02:20 +0000 (17:02 +0200)]
i965/vec4: fill src_reg type using the constructor type parameter

The src_reg constructor that receives a glsl_type was using it
only to build the swizzle, not to fill this->type as dst_reg
does.

This caused a type mismatch between movs and ALU operations
on the NIR path, so the copy propagation optimization was not applied
to remove unneeded movs when a negate modifier was involved. This was
first detected on minus (negate+add) operations.

Shader DB results (taking into account only vec4):

total instructions in shared programs: 20019 -> 19934 (-0.42%)
instructions in affected programs:     2918 -> 2833 (-2.91%)
helped:                                79
HURT:                                  0
GAINED:                                0
LOST:                                  0

Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years ago  r600g: Add doubles support for CYPRESS
Glenn Kennard [Wed, 12 Aug 2015 00:27:39 +0000 (10:27 +1000)]
r600g: Add doubles support for CYPRESS

This doesn't enable the support, just adds some of
the code, so we don't have to keep rebasing.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years ago  r600g: add doubles support for CAYMAN
Dave Airlie [Fri, 20 Feb 2015 00:47:15 +0000 (10:47 +1000)]
r600g: add doubles support for CAYMAN

Only a subset of the AMD GPUs supported by r600g support doubles.
CAYMAN and CYPRESS are probably all we'll try to support; however,
I don't have a CYPRESS, so ignore that for now.

This disables SB support for doubles, as we think we need to
make the scheduler smarter to introduce delay slots.

[airlied: pushing this to avoid pain of rebasing, it mostly
works on cayman only so far, Glenn has some ideas about
delay slot issues we need to look into. turned off by
default for now]

Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years ago  tgsi/scan: add uses_doubles to tgsi scanner
Dave Airlie [Fri, 20 Feb 2015 00:40:46 +0000 (10:40 +1000)]
tgsi/scan: add uses_doubles to tgsi scanner

This allows drivers to work out if a shader contains any
double opcodes easily.
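
As a loose illustration of such a scan: the opcode enum, the info
struct, and the function names below are hypothetical and much smaller
than the real TGSI definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opcode set; the D-prefixed ones operate on doubles. */
enum sketch_opcode {
   SKETCH_OP_ADD,
   SKETCH_OP_MUL,
   SKETCH_OP_DADD,
   SKETCH_OP_DMUL
};

struct sketch_shader_info { bool uses_doubles; };

static bool sketch_is_double_opcode(enum sketch_opcode op)
{
   return op == SKETCH_OP_DADD || op == SKETCH_OP_DMUL;
}

/* Walk the instruction list once and record whether any double opcode appears. */
void sketch_scan_shader(const enum sketch_opcode *ops, size_t count,
                        struct sketch_shader_info *info)
{
   assert(info != NULL);
   assert(ops != NULL || count == 0);

   info->uses_doubles = false;
   for (size_t i = 0; i < count; i++) {
      if (sketch_is_double_opcode(ops[i]))
         info->uses_doubles = true;
   }
}
```

A driver can then test the single flag instead of walking the shader
again.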

Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years ago  r600g: add multiple stream support for geom shaders
Glenn Kennard [Thu, 9 Jul 2015 06:37:28 +0000 (16:37 +1000)]
r600g: add multiple stream support for geom shaders

This patch is taken from work by Glenn and myself,
and I've spent some time making it all work here.

This adds support for the multiple streams part of
ARB_gpu_shader5 to r600g.

It doesn't enable ARB_gpu_shader5 yet.

Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years ago  r600g/sb: add support for multiple streams to SB backend
Dave Airlie [Thu, 9 Jul 2015 06:36:16 +0000 (16:36 +1000)]
r600g/sb: add support for multiple streams to SB backend

This adds a peephole and removes an assert that isn't
actually valid with some of the stream emit instructions.

Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years ago  r600g: add support for streams to the assembler.
Dave Airlie [Thu, 9 Jul 2015 06:30:26 +0000 (16:30 +1000)]
r600g: add support for streams to the assembler.

This just adds support to the assembler dumper and allows
stream instructions to be generated. Also fix up the stream
debugging to add stream info.

Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years ago  r600g/sb: dump sampler/resource index modes for textures.
Dave Airlie [Tue, 25 Aug 2015 01:18:48 +0000 (11:18 +1000)]
r600g/sb: dump sampler/resource index modes for textures.

This just aids debugging.

Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years ago  mesa/readpixels: check strides are equal before skipping conversion
Dave Airlie [Tue, 1 Sep 2015 05:57:02 +0000 (15:57 +1000)]
mesa/readpixels: check strides are equal before skipping conversion

The CTS packed_pixels test checks that readpixels doesn't write
into the space between rows; we fail that check here unless
we verify that both the format and the stride match.

This fixes all the core mesa problems with CTS packed_pixels
tests.
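
A small sketch of the condition, assuming formats are plain integer
codes. The helper names are hypothetical and this is not the readpixels
code itself; it only shows why equal strides matter for a direct copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A whole-image copy is only safe when both format and stride agree. */
bool sketch_can_copy_directly(int src_format, int dst_format,
                              size_t src_stride, size_t dst_stride)
{
   return src_format == dst_format && src_stride == dst_stride;
}

/* Row-by-row copy that leaves the gap between destination rows untouched. */
void sketch_copy_rows(unsigned char *dst, size_t dst_stride,
                      const unsigned char *src, size_t src_stride,
                      size_t row_bytes, size_t rows)
{
   assert(dst != NULL && src != NULL);
   assert(row_bytes <= dst_stride && row_bytes <= src_stride);

   for (size_t r = 0; r < rows; r++)
      memcpy(dst + r * dst_stride, src + r * src_stride, row_bytes);
}
```

With a 4-byte source stride and an 8-byte destination stride, the
direct path is rejected and the per-row copy leaves bytes 4..7 of each
destination row as they were.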

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years ago  texcompress_s3tc/fxt1: fix stride checks (v1.1)
Dave Airlie [Tue, 1 Sep 2015 05:44:46 +0000 (15:44 +1000)]
texcompress_s3tc/fxt1: fix stride checks (v1.1)

The fastpath currently checks the RowLength != width, but
if you have a RowLength of 7, and Alignment of 4, then
that shouldn't match.

Align the row length to the pack alignment before comparing.

This fixes compressed cases in CTS packed_pixels_pixelstore
test when SKIP_PIXELS is enabled, which causes row length
to get set.
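
The comparison can be sketched as below. The helper names are
hypothetical, and this is a simplified reading of the check rather than
the texcompress code itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Round 'value' up to the next multiple of 'alignment'. */
unsigned sketch_align_up(unsigned value, unsigned alignment)
{
   assert(alignment > 0);
   return (value + alignment - 1) / alignment * alignment;
}

/* Compare the aligned row length, not the raw one, against the width. */
bool sketch_row_length_differs(unsigned row_length, unsigned alignment,
                               unsigned width)
{
   return sketch_align_up(row_length, alignment) != width;
}
```

With a row length of 7 and an alignment of 4, the aligned row length is
8, so it is 8 rather than 7 that gets compared against the width.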

v1.1: add fxt1 fix (Iago)

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>