mesa.git
9 years agoi965/fs: Handle MRF destinations in lower_integer_multiplication().
Matt Turner [Wed, 2 Sep 2015 05:00:24 +0000 (22:00 -0700)]
i965/fs: Handle MRF destinations in lower_integer_multiplication().

The lowered code reads from the destination, which isn't possible from
message registers.

Fixes the following dEQP tests on SNB:

    dEQP-GLES3.functional.shaders.precision.int.highp_mul_fragment
    dEQP-GLES3.functional.shaders.precision.int.mediump_mul_fragment
    dEQP-GLES3.functional.shaders.precision.int.lowp_mul_fragment

Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years agodocs: document VMware OpenGL 3.3 support
Brian Paul [Thu, 13 Aug 2015 20:50:13 +0000 (13:50 -0700)]
docs: document VMware OpenGL 3.3 support

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years agosvga: update driver for version 10 GPU interface
Brian Paul [Thu, 13 Aug 2015 18:00:58 +0000 (11:00 -0700)]
svga: update driver for version 10 GPU interface

This is a squash commit of roughly two years of development work.
Authors include:
  Brian Paul
  Charmaine Lee
  Thomas Hellstrom
  Jakob Bornecrantz
  Sinclair Yeh
  Mingcheng Chen
  Kai Ninomiya
  MengLin Wu

The driver supports OpenGL 3.3.

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years agosvga: add new version 10 device command prototypes
Brian Paul [Fri, 7 Aug 2015 21:41:17 +0000 (15:41 -0600)]
svga: add new version 10 device command prototypes

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years agosvga: add new svga_streamout.h file
Brian Paul [Fri, 7 Aug 2015 21:23:51 +0000 (15:23 -0600)]
svga: add new svga_streamout.h file

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years agosvga: add new svga_state_tgsi_transform.c file
Brian Paul [Fri, 7 Aug 2015 22:04:03 +0000 (16:04 -0600)]
svga: add new svga_state_tgsi_transform.c file

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years agosvga: add new svga_state_sampler.c file
Brian Paul [Fri, 7 Aug 2015 21:22:18 +0000 (15:22 -0600)]
svga: add new svga_state_sampler.c file

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years agosvga: add new svga_state_gs.c file
Brian Paul [Fri, 7 Aug 2015 21:22:01 +0000 (15:22 -0600)]
svga: add new svga_state_gs.c file

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years agosvga: add new svga_pipe_streamout.c file
Brian Paul [Fri, 7 Aug 2015 21:21:46 +0000 (15:21 -0600)]
svga: add new svga_pipe_streamout.c file

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years agosvga: add new svga_pipe_gs.c file
Brian Paul [Fri, 7 Aug 2015 21:21:29 +0000 (15:21 -0600)]
svga: add new svga_pipe_gs.c file

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years agosvga: add new svga_link.[ch] files
Brian Paul [Fri, 7 Aug 2015 21:21:10 +0000 (15:21 -0600)]
svga: add new svga_link.[ch] files

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years agosvga: add new svga_cmd_vgpu10.c file
Brian Paul [Fri, 7 Aug 2015 20:57:22 +0000 (14:57 -0600)]
svga: add new svga_cmd_vgpu10.c file

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years agosvga: add new svga_tgsi_vgpu10.c file
Brian Paul [Fri, 7 Aug 2015 20:56:51 +0000 (14:56 -0600)]
svga: add new svga_tgsi_vgpu10.c file

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years agosvga: remove unused SVGA3D_* command functions
Brian Paul [Fri, 7 Aug 2015 22:11:14 +0000 (16:11 -0600)]
svga: remove unused SVGA3D_* command functions

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years agogallium/st: add pipe_context::get_timestamp()
Brian Paul [Fri, 7 Aug 2015 20:54:24 +0000 (14:54 -0600)]
gallium/st: add pipe_context::get_timestamp()

The VMware svga driver doesn't directly support pipe_screen::get_timestamp()
but we can do a work-around.  However, we need a gallium context to do so.
This patch adds a new pipe_context::get_timestamp() function that will only
be called if the pipe_screen::get_timestamp() function is NULL.

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years agosvga/winsys: Add support for VGPU10
Brian Paul [Thu, 6 Aug 2015 22:44:35 +0000 (16:44 -0600)]
svga/winsys: Add support for VGPU10

This involves a few driver modifications to keep things building.
The driver may not actually run properly at this point.

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years agosvga: update the svga3d device header files
Brian Paul [Thu, 6 Aug 2015 22:28:19 +0000 (16:28 -0600)]
svga: update the svga3d device header files

Remove some obsolete svga_dump.c code for items which no longer exist.

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years agosvga: add new version 10 device header files
Brian Paul [Fri, 7 Aug 2015 20:56:03 +0000 (14:56 -0600)]
svga: add new version 10 device header files

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years agowinsys/svga: add new vmw_query.c[h] files
Brian Paul [Wed, 29 Jul 2015 17:23:29 +0000 (11:23 -0600)]
winsys/svga: add new vmw_query.c[h] files

Functions for creating, destroying, getting queries, etc.

Signed-off-by: Brian Paul <brianp@vmware.com>
9 years agometa: Compute correct buffer size with SkipRows/SkipPixels
Chris Wilson [Tue, 1 Sep 2015 08:31:15 +0000 (09:31 +0100)]
meta: Compute correct buffer size with SkipRows/SkipPixels

If the user is specifying a subregion of a buffer using SKIP_ROWS and
SKIP_PIXELS, we must compute the buffer size carefully as the end of the
last row may be much shorter than stride*image_height*depth. The current
code tries to memcpy from beyond the end of the user data, for example
causing:

==28136== Invalid read of size 8
==28136==    at 0x4C2D94E: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:915)
==28136==    by 0xB4ADFE3: brw_bo_write (brw_batch.c:1856)
==28136==    by 0xB5B3531: brw_buffer_data (intel_buffer_objects.c:208)
==28136==    by 0xB0F6275: _mesa_buffer_data (bufferobj.c:1600)
==28136==    by 0xB0F6346: _mesa_BufferData (bufferobj.c:1631)
==28136==    by 0xB37A1EE: create_texture_for_pbo (meta_tex_subimage.c:103)
==28136==    by 0xB37A467: _mesa_meta_pbo_TexSubImage (meta_tex_subimage.c:176)
==28136==    by 0xB5C8D61: intelTexSubImage (intel_tex_subimage.c:195)
==28136==    by 0xB254AB4: _mesa_texture_sub_image (teximage.c:3654)
==28136==    by 0xB254C9F: texsubimage (teximage.c:3712)
==28136==    by 0xB2550E9: _mesa_TexSubImage2D (teximage.c:3853)
==28136==    by 0x401CA0: UploadTexSubImage2D (teximage.c:171)
==28136==  Address 0xd8bfbe0 is 0 bytes after a block of size 1,024 alloc'd
==28136==    at 0x4C28C20: malloc (vg_replace_malloc.c:296)
==28136==    by 0x402014: PerfDraw (teximage.c:270)
==28136==    by 0x402648: Draw (glmain.c:182)
==28136==    by 0x8385E63: ??? (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0)
==28136==    by 0x83896C8: fgEnumWindows (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0)
==28136==    by 0x838641C: glutMainLoopEvent (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0)
==28136==    by 0x8386C1C: glutMainLoop (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0)
==28136==    by 0x4019C1: main (glmain.c:262)
==28136==
==28136== Invalid read of size 8
==28136==    at 0x4C2D940: memcpy@@GLIBC_2.14 (vg_replace_strmem.c:915)
==28136==    by 0xB4ADFE3: brw_bo_write (brw_batch.c:1856)
==28136==    by 0xB5B3531: brw_buffer_data (intel_buffer_objects.c:208)
==28136==    by 0xB0F6275: _mesa_buffer_data (bufferobj.c:1600)
==28136==    by 0xB0F6346: _mesa_BufferData (bufferobj.c:1631)
==28136==    by 0xB37A1EE: create_texture_for_pbo (meta_tex_subimage.c:103)
==28136==    by 0xB37A467: _mesa_meta_pbo_TexSubImage (meta_tex_subimage.c:176)
==28136==    by 0xB5C8D61: intelTexSubImage (intel_tex_subimage.c:195)
==28136==    by 0xB254AB4: _mesa_texture_sub_image (teximage.c:3654)
==28136==    by 0xB254C9F: texsubimage (teximage.c:3712)
==28136==    by 0xB2550E9: _mesa_TexSubImage2D (teximage.c:3853)
==28136==    by 0x401CA0: UploadTexSubImage2D (teximage.c:171)
==28136==  Address 0xd8bfbe8 is 8 bytes after a block of size 1,024 alloc'd
==28136==    at 0x4C28C20: malloc (vg_replace_malloc.c:296)
==28136==    by 0x402014: PerfDraw (teximage.c:270)
==28136==    by 0x402648: Draw (glmain.c:182)
==28136==    by 0x8385E63: ??? (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0)
==28136==    by 0x83896C8: fgEnumWindows (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0)
==28136==    by 0x838641C: glutMainLoopEvent (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0)
==28136==    by 0x8386C1C: glutMainLoop (in /usr/lib/x86_64-linux-gnu/libglut.so.3.9.0)
==28136==    by 0x4019C1: main (glmain.c:262)
==28136==

Fixes regression from commit 7f396189f073d626c5f7a2c232dac92b65f5a23f
Author: Jason Ekstrand <jason.ekstrand@intel.com>
Date:   Mon Jan 5 18:17:04 2015 -0800

    meta: Add a BlitFramebuffers-based implementation of TexSubImage

v2: However, the teximage we create does need to be width x full_height x 1

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Neil Roberts <neil@linux.intel.com>
Reviewed-by Neil Roberts <neil@linux.intel.com>

9 years agoi965/vec4: fill src_reg type using the constructor type parameter
Alejandro Piñeiro [Tue, 1 Sep 2015 15:02:20 +0000 (17:02 +0200)]
i965/vec4: fill src_reg type using the constructor type parameter

The src_reg constructor that received the glsl_type was using it
only to build the swizzle, but not to fill this->type as dst_reg
is doing.

This caused some type mismatch between movs and alu operations
on the NIR path, so copy propagation optimization was not applied
to remove unneeded movs if negate modifier was involved. This was
first detected on minus (negate+add) operations.

Shader DB results (taking into account only vec4):

total instructions in shared programs: 20019 -> 19934 (-0.42%)
instructions in affected programs:     2918 -> 2833 (-2.91%)
helped:                                79
HURT:                                  0
GAINED:                                0
LOST:                                  0

Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years agor600g: Add doubles support for CYPRESS
Glenn Kennard [Wed, 12 Aug 2015 00:27:39 +0000 (10:27 +1000)]
r600g: Add doubles support for CYPRESS

This doesn't enable the support, just adds some of
the code, so we don't have to keep rebasing.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agor600g: add doubles support for CAYMAN
Dave Airlie [Fri, 20 Feb 2015 00:47:15 +0000 (10:47 +1000)]
r600g: add doubles support for CAYMAN

Only a subset of AMD GPUs supported by r600g support doubles,
CAYMAN and CYPRESS are probably all we'll try and support, however
I don't have a CYPRESS so ignore that for now.

This disables SB support for doubles, as we think we need to
make the scheduler smarter to introduce delay slots.

[airlied: pushing this to avoid pain of rebasing, it mostly
works on cayman only so far, Glenn has some ideas about
delay slot issues we need to look into. turned off by
default for now]

Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agotgsi/scan: add uses_doubles to tgsi scanner
Dave Airlie [Fri, 20 Feb 2015 00:40:46 +0000 (10:40 +1000)]
tgsi/scan: add uses_doubles to tgsi scanner

This allows drivers to work out if a shader contains any
double opcodes easily.

Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agor600g: add multiple stream support for geom shaders
Glenn Kennard [Thu, 9 Jul 2015 06:37:28 +0000 (16:37 +1000)]
r600g: add multiple stream support for geom shaders

This patch is taken from work by Glenn and myself,
and I've spent some time making it all work here.

This adds support for the multiple streams part of
ARB_gpu_shader5 to r600g.

It doesn't enable ARB_gpu_shader5 yet.

Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agor600g/sb: add support for multiple streams to SB backend
Dave Airlie [Thu, 9 Jul 2015 06:36:16 +0000 (16:36 +1000)]
r600g/sb: add support for multiple streams to SB backend

This adds a peephole and removes an assert that isn't
actually valid with some of the stream emit instructions.

Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agor600g: add support for streams to the assembler.
Dave Airlie [Thu, 9 Jul 2015 06:30:26 +0000 (16:30 +1000)]
r600g: add support for streams to the assembler.

This just adds support to the assembler dumper and allows
stream instructions to be generated. Also fix up the stream
debugging to add stream info.

Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agor600g/sb: dump sampler/resource index modes for textures.
Dave Airlie [Tue, 25 Aug 2015 01:18:48 +0000 (11:18 +1000)]
r600g/sb: dump sampler/resource index modes for textures.

This just aids debugging.

Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agomesa/readpixels: check strides are equal before skipping conversion
Dave Airlie [Tue, 1 Sep 2015 05:57:02 +0000 (15:57 +1000)]
mesa/readpixels: check strides are equal before skipping conversion

The CTS packed_pixels test checks that readpixels doesn't write
into the space between rows, however we fail that here unless
we check the format and stride match.

This fixes all the core mesa problems with CTS packed_pixels
tests.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agotexcompress_s3tc/fxt1: fix stride checks (v1.1)
Dave Airlie [Tue, 1 Sep 2015 05:44:46 +0000 (15:44 +1000)]
texcompress_s3tc/fxt1: fix stride checks (v1.1)

The fastpath currently checks the RowLength != width, but
if you have a RowLength of 7, and Alignment of 4, then
that shouldn't match.

align the rowlength to the pack alignment before comparing.

This fixes compressed cases in CTS packed_pixels_pixelstore
test when SKIP_PIXELS is enabled, which causes row length
to get set.

v1.1: add fxt1 fix (Iago)

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agost/readpixels: fix accel path for skipimages.
Dave Airlie [Tue, 1 Sep 2015 05:13:45 +0000 (15:13 +1000)]
st/readpixels: fix accel path for skipimages.

We don't need to use the 3d image address here as that will
include SKIP_IMAGES, and we are only blitting a single
2D anyways, so just use the 2D path.

This fixes some memory overruns under CTS
 packed_pixels.packed_pixels_pixelstore when PACK_SKIP_IMAGES
is used.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agomesa/formats: 8-bit channel integer formats addition
Dave Airlie [Thu, 30 Jul 2015 01:48:37 +0000 (02:48 +0100)]
mesa/formats: 8-bit channel integer formats addition

Add enough 8-bit channel formats to handle all the
different things CTS throws at us.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agomesa/formats: add some formats from GL3.3
Dave Airlie [Thu, 30 Jul 2015 01:48:36 +0000 (02:48 +0100)]
mesa/formats: add some formats from GL3.3

GL3.3 added GL_ARB_texture_rgb10_a2ui, which specifies
a lot more things than just rgb10/a2ui.

While playing with ogl conform one of the tests must
attempted all valid formats for GL3.3 and hits the
unreachable here.

This adds the first chunk of formats that hit the
assert.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agomesa: handle SwapBytes in compressed texture get code.
Dave Airlie [Tue, 25 Aug 2015 11:13:13 +0000 (21:13 +1000)]
mesa: handle SwapBytes in compressed texture get code.

This case just wasn't handled, so add support for it.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agomesa: fix SwapBytes handling in numerous places
Dave Airlie [Tue, 25 Aug 2015 04:36:01 +0000 (14:36 +1000)]
mesa: fix SwapBytes handling in numerous places

In a number of places the SwapBytes handling didn't handle cases with
GL_(UN)PACK_ALIGNMENT set and 7 byte width cases aligned to 8 bytes.

This adds a common routine to swap bytes a 2D image and uses this
code in:

texture storage
texture get
readpixels
swrast drawpixels.

[airlied: updated with Brian's nitpicks].

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agoauxiliary/os: Don't implement os_get_option() on embedded builds.
José Fonseca [Tue, 1 Sep 2015 22:29:17 +0000 (16:29 -0600)]
auxiliary/os: Don't implement os_get_option() on embedded builds.

Let it be defined externally instead, allowing setting mechanisms other
than environment variables.

Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Matthew McClure <mcclurem@vmware.com>
9 years agoutil: add a couple primitive restart helper functions
Brian Paul [Tue, 1 Sep 2015 22:29:17 +0000 (16:29 -0600)]
util: add a couple primitive restart helper functions

The first function translates prim restart indexes to be 0xffff or
0xffffffff.

The second splits indexed primitives with restart indexes into sub-
primitives without restart indexes.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
9 years agotgsi: add tgsi utility to transform a fragment shader to support aa point
Charmaine Lee [Tue, 1 Sep 2015 22:29:17 +0000 (16:29 -0600)]
tgsi: add tgsi utility to transform a fragment shader to support aa point

This adds a tgsi utility tgsi_add_aa_point to transform a fragment shader
to support anti-aliased wide point by computing the fragment distance from
the point center. This utility assumes the geometry shader is emitting
an extra generic output with point coord data. The semantic index of
this generic output is passed to the tgsi_add_aa_point utility.

Reviewed-by: Brian Paul <brianp@vmware.com>
9 years agotgsi: adds tgsi utility to transform a shader to support point sprite
Charmaine Lee [Tue, 1 Sep 2015 22:29:17 +0000 (16:29 -0600)]
tgsi: adds tgsi utility to transform a shader to support point sprite

This adds a tgsi utility tgsi_add_point_sprite to transform a geometry
shader to emulate wide points by drawing quads. This utility adds an
extra output for the original point position if the point position is
to be written to a stream output buffer. It also assumes the driver will
add a constant for inverse viewport scale after the user defined constants.

Reviewed-by: Brian Paul <brianp@vmware.com>
9 years agotgsi: add new tgsi_two_side.c utility code
Brian Paul [Tue, 1 Sep 2015 22:29:17 +0000 (16:29 -0600)]
tgsi: add new tgsi_two_side.c utility code

This could be used by any driver where the device doesn't directly
support two-sided lighting.  This code modifies a fragment shader
to accecpt back-face colors and choose between the front/back colors
depending on the triangle's front-face sign.

9 years agoutil: add util_strcasecmp() wrapper
Brian Paul [Tue, 1 Sep 2015 22:29:17 +0000 (16:29 -0600)]
util: add util_strcasecmp() wrapper

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
9 years agogallium/util: add a utility to create geometry passthrough shader
Charmaine Lee [Tue, 1 Sep 2015 22:29:17 +0000 (16:29 -0600)]
gallium/util: add a utility to create geometry passthrough shader

Reviewed-by: Brian Paul <brianp@vmware.com>
9 years agogallium/util: fix returning empty box for rectangle intersection
Roland Scheidegger [Tue, 1 Sep 2015 22:29:17 +0000 (16:29 -0600)]
gallium/util: fix returning empty box for rectangle intersection

These functions deal with inclusive coordinates, hence a 0/0/0/0 rect
returned when there's no intersection doesn't actually represent an empty
rectangle. Hence return 0/-1/0/-1 instead.
This fixes some problems in llvmpipe with empty scissor rects (which up
to now didn't really matter because while the intersect test returned the
wrong result all pixels were scissored away later anyway).

9 years agogallium/util: return FALSE for intersection if there's empty rectangles
Roland Scheidegger [Tue, 1 Sep 2015 22:29:17 +0000 (16:29 -0600)]
gallium/util: return FALSE for intersection if there's empty rectangles

It isn't really obvious if intersection test should take into account empty
rectangles or if the caller should do it. But it looks like most callers
actually verified one of the rects but not the other, but since correctly
returning an empty rect that other rect could actually be empty leading to
more bugs. Hence just verify both rects for emptyness in the intersection
test itself which makes the code easier in the caller (though it will be
slower if the caller knows the rectangles are non-empty).

Reviewed-by: Zack Rusin <zackr@vmware.com>
9 years agotgsi: add some more helper functions
Charmaine Lee [Tue, 1 Sep 2015 22:29:17 +0000 (16:29 -0600)]
tgsi: add some more helper functions

This patch adds some more helper functions such as
   . tgsi_transform_temps_decl
   . tgsi_transform_output_decl
   . tgsi_transform_dst_reg
   . tgsi_transform_src_reg

Reviewed-by: Brian Paul <brianp@vmware.com>
9 years agotgsi: added tgsi_is_shadow_target() helper
Brian Paul [Tue, 1 Sep 2015 22:29:17 +0000 (16:29 -0600)]
tgsi: added tgsi_is_shadow_target() helper

9 years agotgsi: add negate parameter to tgsi_transform_kill_inst()
Brian Paul [Tue, 1 Sep 2015 22:29:17 +0000 (16:29 -0600)]
tgsi: add negate parameter to tgsi_transform_kill_inst()

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
9 years agoutil: added ffsll() function
Brian Paul [Tue, 1 Sep 2015 22:29:17 +0000 (16:29 -0600)]
util: added ffsll() function

v2: fix errant _GNU_SOURCE test, per Matt Turner.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years agoutil: added util_set_index_buffer()
Brian Paul [Tue, 1 Sep 2015 22:29:17 +0000 (16:29 -0600)]
util: added util_set_index_buffer()

Like util_set_vertex_buffers_count(), this basically just copies a
pipe_index_buffer object, taking care of refcounting.

9 years agomesa: Move gl_vert_attrib from mtypes.h to shader_enums.h
Jason Ekstrand [Mon, 31 Aug 2015 21:55:49 +0000 (14:55 -0700)]
mesa: Move gl_vert_attrib from mtypes.h to shader_enums.h

It is a shader enum after all...

Acked-by: Brian Paul <brianp@vmware.com>
9 years agoglapi: Inline x86_64_current_tls().
Matt Turner [Fri, 26 Sep 2014 00:28:20 +0000 (17:28 -0700)]
glapi: Inline x86_64_current_tls().

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
9 years agor600g: Simplify out a couple of unnecessary branches
Edward O'Callaghan [Tue, 1 Sep 2015 08:38:34 +0000 (18:38 +1000)]
r600g: Simplify out a couple of unnecessary branches

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
9 years agoradeonsi: use an indirect buffer for init_config
Marek Olšák [Sun, 30 Aug 2015 16:46:06 +0000 (18:46 +0200)]
radeonsi: use an indirect buffer for init_config

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: add IB2 indirect buffer support for pm4 states
Marek Olšák [Sun, 30 Aug 2015 16:39:19 +0000 (18:39 +0200)]
radeonsi: add IB2 indirect buffer support for pm4 states

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agowinsys/radeon: add a flag telling how gfx IBs should be padded
Marek Olšák [Sun, 30 Aug 2015 15:41:23 +0000 (17:41 +0200)]
winsys/radeon: add a flag telling how gfx IBs should be padded

This is always false on amdgpu (set by calloc).

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agowinsys/amdgpu: remove IB padding for SI
Marek Olšák [Sun, 30 Aug 2015 15:39:03 +0000 (17:39 +0200)]
winsys/amdgpu: remove IB padding for SI

SI is unsupported by amdgpu

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: remove unused macro si_pm4_set_state
Marek Olšák [Sun, 30 Aug 2015 12:43:59 +0000 (14:43 +0200)]
radeonsi: remove unused macro si_pm4_set_state

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: remove si_pm4_cleanup
Marek Olšák [Sun, 30 Aug 2015 12:39:54 +0000 (14:39 +0200)]
radeonsi: remove si_pm4_cleanup

All remaining pm4 state are created and destroyed by state trackers.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: rework uploading border colors
Marek Olšák [Sun, 30 Aug 2015 12:13:10 +0000 (14:13 +0200)]
radeonsi: rework uploading border colors

The border colors are uploaded only once when the state is created.

This brings truly immutable sampler descriptors, because they don't have
to be updated every time a sampler state is re-bound.

It also moves the TA_BC_BASE_ADDR registers to init_config, removing one
more state. The catch is there is now a limit: only 4096 border colors can
be used by one context. I don't think that will be a problem.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: use all built-in border colors
Marek Olšák [Sun, 30 Aug 2015 11:17:15 +0000 (13:17 +0200)]
radeonsi: use all built-in border colors

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: inline si_cmd_context_control
Marek Olšák [Sun, 30 Aug 2015 10:39:45 +0000 (12:39 +0200)]
radeonsi: inline si_cmd_context_control

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: remove unused si_pm4_state code
Marek Olšák [Sun, 30 Aug 2015 10:35:02 +0000 (12:35 +0200)]
radeonsi: remove unused si_pm4_state code

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: reorder si_context variables
Marek Olšák [Sun, 30 Aug 2015 10:25:03 +0000 (12:25 +0200)]
radeonsi: reorder si_context variables

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: don't send IB dword usage to si_need_cs_space
Marek Olšák [Sun, 30 Aug 2015 01:56:13 +0000 (03:56 +0200)]
radeonsi: don't send IB dword usage to si_need_cs_space

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: don't set number of IB dwords for states
Marek Olšák [Sun, 30 Aug 2015 01:53:39 +0000 (03:53 +0200)]
radeonsi: don't set number of IB dwords for states

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: don't count IB space for states, just use an upper bound
Marek Olšák [Sun, 30 Aug 2015 01:49:15 +0000 (03:49 +0200)]
radeonsi: don't count IB space for states, just use an upper bound

Since we don't put any resource descriptors in IBs, the space used by draw
calls is quite small.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: convert SPI state to an atom
Marek Olšák [Sun, 30 Aug 2015 01:17:30 +0000 (03:17 +0200)]
radeonsi: convert SPI state to an atom

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agogallium/radeon: rename r600_context_bo_reloc -> radeon_add_to_buffer_list
Marek Olšák [Sun, 30 Aug 2015 00:04:37 +0000 (02:04 +0200)]
gallium/radeon: rename r600_context_bo_reloc -> radeon_add_to_buffer_list

this name should be easy to understand without other knowledge

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agogallium/radeon: rename write_*_reg functions
Marek Olšák [Sat, 29 Aug 2015 23:54:00 +0000 (01:54 +0200)]
gallium/radeon: rename write_*_reg functions

e.g. radeon_set_context_reg is nicer and looks consistent next to
radeon_emit().

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: rename and precalculate polygon offset states
Marek Olšák [Sat, 29 Aug 2015 23:35:03 +0000 (01:35 +0200)]
radeonsi: rename and precalculate polygon offset states

one less calloc and state construction while drawing

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: convert CB_TARGET_MASK setup to an atom
Marek Olšák [Sat, 29 Aug 2015 22:50:42 +0000 (00:50 +0200)]
radeonsi: convert CB_TARGET_MASK setup to an atom

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: don't set VGT_VTX_CNT_EN twice in init_config
Marek Olšák [Sat, 29 Aug 2015 22:16:01 +0000 (00:16 +0200)]
radeonsi: don't set VGT_VTX_CNT_EN twice in init_config

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: convert stencil ref state into an atom
Marek Olšák [Sat, 29 Aug 2015 15:00:11 +0000 (17:00 +0200)]
radeonsi: convert stencil ref state into an atom

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: convert blend color state into an atom
Marek Olšák [Sat, 29 Aug 2015 13:05:53 +0000 (15:05 +0200)]
radeonsi: convert blend color state into an atom

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: convert sample mask state into an atom
Marek Olšák [Sat, 29 Aug 2015 13:05:53 +0000 (15:05 +0200)]
radeonsi: convert sample mask state into an atom

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: convert clip state into an atom
Marek Olšák [Sat, 29 Aug 2015 12:54:58 +0000 (14:54 +0200)]
radeonsi: convert clip state into an atom

Reducing calloc overhead.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: avoid redundant CB and DB register updates
Marek Olšák [Sat, 29 Aug 2015 00:32:13 +0000 (02:32 +0200)]
radeonsi: avoid redundant CB and DB register updates

The main idea is to avoid setting CB_COLORi_INFO = 0 for i>0 repeatedly
when those colorbuffers aren't used. This is mainly for glamor.

Same for DB. Z_INFO and STENCIL_INFO need to be cleared only once.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: don't rebind GSVS ring buffers every draw call using GS
Marek Olšák [Sat, 29 Aug 2015 00:02:29 +0000 (02:02 +0200)]
radeonsi: don't rebind GSVS ring buffers every draw call using GS

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: don't clear the tessellation factor ring buffer
Marek Olšák [Fri, 28 Aug 2015 23:56:27 +0000 (01:56 +0200)]
radeonsi: don't clear the tessellation factor ring buffer

Leftover from the bring-up.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: remove the tf_ring state, add the registers to init_config
Marek Olšák [Fri, 28 Aug 2015 23:45:28 +0000 (01:45 +0200)]
radeonsi: remove the tf_ring state, add the registers to init_config

One less state to worry about.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: remove the gs_rings state, add the registers to init_config
Marek Olšák [Fri, 28 Aug 2015 23:45:28 +0000 (01:45 +0200)]
radeonsi: remove the gs_rings state, add the registers to init_config

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: use a bitmask for tracking dirty atoms
Marek Olšák [Fri, 28 Aug 2015 22:49:40 +0000 (00:49 +0200)]
radeonsi: use a bitmask for tracking dirty atoms

This mainly removes the cache misses when checking the dirty flags.
Not much else though.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: initialize atom IDs for external atoms
Marek Olšák [Fri, 28 Aug 2015 22:03:02 +0000 (00:03 +0200)]
radeonsi: initialize atom IDs for external atoms

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: call si_init_atom for remaining radeonsi atoms
Marek Olšák [Fri, 28 Aug 2015 21:52:47 +0000 (23:52 +0200)]
radeonsi: call si_init_atom for remaining radeonsi atoms

I need to initialize more atom IDs.

This adds 4 more si_init_atom calls, which simplifies the code.
(si_init_atom needs a different context type of the emit functions though)

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: initialize atom IDs
Marek Olšák [Fri, 28 Aug 2015 21:26:50 +0000 (23:26 +0200)]
radeonsi: initialize atom IDs

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: define the state atom array separately
Marek Olšák [Fri, 28 Aug 2015 19:59:22 +0000 (21:59 +0200)]
radeonsi: define the state atom array separately

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: optimize viewport states
Marek Olšák [Fri, 28 Aug 2015 19:48:37 +0000 (21:48 +0200)]
radeonsi: optimize viewport states

same as scissors

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: optimize scissor states
Marek Olšák [Fri, 28 Aug 2015 19:08:49 +0000 (21:08 +0200)]
radeonsi: optimize scissor states

- convert 16 states to 1 atom
- only emit 1 scissor if VIEWPORT_INDEX isn't written
- use only one packet when emitting consecutive scissors

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: add SI_MAX_ATTRIBS
Marek Olšák [Fri, 28 Aug 2015 20:33:02 +0000 (22:33 +0200)]
radeonsi: add SI_MAX_ATTRIBS

PIPE_MAX_ATTRIBS is 32, but we currently only support 16.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: fix memory usage checking for big IBs
Marek Olšák [Sun, 30 Aug 2015 01:44:03 +0000 (03:44 +0200)]
radeonsi: fix memory usage checking for big IBs

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: set all 16 viewport Z bounds for GL 4.1
Marek Olšák [Sat, 29 Aug 2015 22:12:03 +0000 (00:12 +0200)]
radeonsi: set all 16 viewport Z bounds for GL 4.1

Cc: 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoradeonsi: fix a Unigine Heaven hang when drirc is missing
Marek Olšák [Sat, 29 Aug 2015 20:59:23 +0000 (22:59 +0200)]
radeonsi: fix a Unigine Heaven hang when drirc is missing

Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agowinsys/amdgpu: use small IBs for better performance on VI
Marek Olšák [Sun, 30 Aug 2015 09:59:23 +0000 (11:59 +0200)]
winsys/amdgpu: use small IBs for better performance on VI

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agogallium/util: add u_bit_scan_consecutive_range
Marek Olšák [Fri, 28 Aug 2015 18:53:08 +0000 (20:53 +0200)]
gallium/util: add u_bit_scan_consecutive_range

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
9 years agoi965: Prevent coordinate overflow in intel_emit_linear_blit
Chris Wilson [Sat, 6 Jun 2015 08:33:33 +0000 (09:33 +0100)]
i965: Prevent coordinate overflow in intel_emit_linear_blit

Fixes regression from
commit 8c17d53823c77ac1c56b0548e4e54f69a33285f1
Author: Kenneth Graunke <kenneth@whitecape.org>
Date:   Wed Apr 15 03:04:33 2015 -0700

    i965: Make intel_emit_linear_blit handle Gen8+ alignment restrictions.

which adjusted the coordinates to be relative to the nearest cacheline.
However, this then offsets the coordinates by up to 63 and this may then
cause them to overflow the BLT limits. For the well aligned large
transfer case, we can use 32bpp pixels and so reduce the coordinates by
4 (versus the current 8bpp pixels). We also have to be more careful
doing the last line just in case it may exceed the coordinate limit.

Reported-and-tested-by: kaillasse91@hotmail.fr
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90734
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Kenneth Graunke <kenneth@whitecape.org>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Anuj Phogat <anuj.phogat@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
9 years agoi965/nir: enable the dead control flow optimization
Connor Abbott [Fri, 1 May 2015 06:51:12 +0000 (02:51 -0400)]
i965/nir: enable the dead control flow optimization

total instructions in shared programs: 7541551 -> 7541381 (-0.00%)
instructions in affected programs:     3054 -> 2884 (-5.57%)
helped:                                29

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agonir/dead_cf: add support for removing useless loops
Connor Abbott [Fri, 8 May 2015 18:42:14 +0000 (14:42 -0400)]
nir/dead_cf: add support for removing useless loops

v2: fix detecting if the loop has any phi nodes after it.
v2: use nir_foreach_ssa_def() instead of nir_foreach_dest() when
    checking for values live after the loop to catch const_load
    instructions.
v2: fix handling return instructions
v2: add some documentation to loop_is_dead()

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years agonir: add a helper for iterating over blocks in a cf node
Connor Abbott [Fri, 8 May 2015 18:40:58 +0000 (14:40 -0400)]
nir: add a helper for iterating over blocks in a cf node

We were already doing this internally for iterating over a function
implementation, so just expose it directly.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agonir: add nir_block_get_following_loop() helper
Connor Abbott [Fri, 8 May 2015 17:17:10 +0000 (13:17 -0400)]
nir: add nir_block_get_following_loop() helper

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agonir/dead_cf: delete code that's unreachable due to jumps
Connor Abbott [Fri, 8 May 2015 05:44:24 +0000 (01:44 -0400)]
nir/dead_cf: delete code that's unreachable due to jumps

v2: use nir_cf_node_remove_after().
v2: use foreach_list_typed() instead of hardcoding a list walk.
v3: update to new control flow modification helpers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>