mesa.git
10 years agoc11/threads: correct assertion
Emil Velikov [Fri, 1 Aug 2014 16:39:49 +0000 (17:39 +0100)]
c11/threads: correct assertion

We should assert when either the function or the flag pointer
is null or we'll end up with a null reference a few lines later.

Currently unused by mesa thus it has gone unnoticed.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
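
The corrected check can be sketched as follows. This is a minimal, single-threaded illustration and not the c11/threads code itself: demo_once_flag, demo_init and demo_call_once are hypothetical names, and the real implementation is assumed to differ in how it synchronises.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a once-flag; not the type used in Mesa. */
typedef struct {
    int done;
} demo_once_flag;

static int demo_calls;

/* Example initialiser that just counts how often it ran. */
static void demo_init(void)
{
    demo_calls++;
}

/* Both pointers are dereferenced below, so the assertion has to require
 * that both are non-null. Not thread-safe; illustration only. */
static void demo_call_once(demo_once_flag *flag, void (*func)(void))
{
    assert(flag != NULL && func != NULL);
    if (!flag->done) {
        flag->done = 1;
        func();
    }
}
```

Calling demo_call_once() twice with the same flag runs demo_init() only once; passing a null flag or function trips the assertion instead of crashing a few lines later.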
10 years agodocs: now distributing the GL/glcorearb.h header
Brian Paul [Tue, 12 Aug 2014 16:24:00 +0000 (10:24 -0600)]
docs: now distributing the GL/glcorearb.h header

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agomesa: pull Khronos glcorearb.h header into include/GL/
Brian Paul [Tue, 12 Aug 2014 16:20:30 +0000 (10:20 -0600)]
mesa: pull Khronos glcorearb.h header into include/GL/

Apps that only want to use core functionality should #include this
header.  This version covers everything up to OpenGL 4.5.

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agovc4: Drop the dump_fbo() routine.
Eric Anholt [Mon, 11 Aug 2014 19:47:30 +0000 (12:47 -0700)]
vc4: Drop the dump_fbo() routine.

Now that eglkms is working, and some tests are working under
PIGLIT_PLATFORM=gbm, I don't think I need this any more.

10 years agovc4: Claim the GL 2.1 minimum for 3D textures.
Eric Anholt [Tue, 12 Aug 2014 17:06:48 +0000 (10:06 -0700)]
vc4: Claim the GL 2.1 minimum for 3D textures.

We don't actually do them (or even fake them) currently, but it does get
us a bunch of unrelated glean glsl1 tests passing, which previously would
error out due to glean assuming the minimums on a 3D texture that 2 of the
subtests use.

10 years agovc4: Declare what vertex formats we actually support.
Eric Anholt [Mon, 11 Aug 2014 23:03:17 +0000 (16:03 -0700)]
vc4: Declare what vertex formats we actually support.

We will support more than this eventually, but for now this makes u_vbuf
format-convert a few things (32-bit snorm and scaled, doubles) for us.

10 years agovc4: Stash some debug code for format support checks.
Eric Anholt [Mon, 11 Aug 2014 23:00:28 +0000 (16:00 -0700)]
vc4: Stash some debug code for format support checks.

This can be useful for looking at context init setup and texture format
choices, and there's no reason for the silly retval computation we do if
you're not going to have this code (mostly from freedreno) around.

10 years agovc4: Texture format support has nothing to do with VBO format support.
Eric Anholt [Mon, 11 Aug 2014 22:55:45 +0000 (15:55 -0700)]
vc4: Texture format support has nothing to do with VBO format support.

This was inherited from freedreno, but doesn't apply to us.

10 years agovc4: Fix off-by-one in texture maximum levels.
Eric Anholt [Mon, 11 Aug 2014 22:37:05 +0000 (15:37 -0700)]
vc4: Fix off-by-one in texture maximum levels.

It's 2048x2048 that's the max, not 1024x1024.
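
The arithmetic behind the off-by-one can be sketched like this, assuming the limit is expressed as a count of mipmap levels. demo_max_texture_levels is a hypothetical helper, not a function from the driver.

```c
#include <assert.h>

/* Number of mipmap levels for a texture whose largest level is
 * max_size texels wide: floor(log2(max_size)) + 1. */
static unsigned demo_max_texture_levels(unsigned max_size)
{
    unsigned levels = 0;

    assert(max_size > 0);
    while (max_size) {
        levels++;
        max_size >>= 1;
    }
    return levels;
}
```

With this helper a 1024x1024 maximum gives 11 levels and a 2048x2048 maximum gives 12, so advertising one level too few caps the texture size at half the intended width.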

10 years agovc4: Add support for the FLR opcode.
Eric Anholt [Mon, 11 Aug 2014 22:24:43 +0000 (15:24 -0700)]
vc4: Add support for the FLR opcode.

10 years agoi965: Delete the Gen8 code generators.
Kenneth Graunke [Mon, 11 Aug 2014 17:07:07 +0000 (10:07 -0700)]
i965: Delete the Gen8 code generators.

We now use the brw_eu_emit.c code instead.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoi965: Never use the Gen8 code generators.
Kenneth Graunke [Mon, 11 Aug 2014 17:05:01 +0000 (10:05 -0700)]
i965: Never use the Gen8 code generators.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoi965: Switch to the EU emit layer for code generation on Broadwell.
Kenneth Graunke [Mon, 30 Jun 2014 16:04:26 +0000 (09:04 -0700)]
i965: Switch to the EU emit layer for code generation on Broadwell.

Everything should be in place to unify code generation between Gen4-7
and Gen8+.  We should be able to drop the Gen8 generators at this point.

However, leave them hooked up for a brief moment, for testing and
comparison purposes.  Set GEN8=1 to use the old Gen8+ code generator
paths.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoi965: Retype atomics to UD in Gen8 code generation.
Kenneth Graunke [Sat, 12 Jul 2014 00:07:03 +0000 (17:07 -0700)]
i965: Retype atomics to UD in Gen8 code generation.

Kind of a moot point since we're deleting Gen8 code generation, but
this at least helps make it match the Gen4-7 code.  It's probably more
reasonable than using float.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoi965/vp: Use the sampler for pull constant loads on Gen7/7.5.
Kenneth Graunke [Mon, 11 Aug 2014 05:49:55 +0000 (22:49 -0700)]
i965/vp: Use the sampler for pull constant loads on Gen7/7.5.

This improves performance in Trine 2 at 1280x720 (windowed) on "Very
High" settings by 30% (in the interactive menu) to 45% (in the forest
by the giant frog) on Haswell GT3e.

It also now generates the same assembly on Gen7 as it does on Gen8,
which always used the sampler for both types.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoi965/vec4: Drop gen <= 7 assertion in pull constant load handling.
Kenneth Graunke [Mon, 11 Aug 2014 03:41:42 +0000 (20:41 -0700)]
i965/vec4: Drop gen <= 7 assertion in pull constant load handling.

I don't see any reason for this to exist.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoi965/eu: Set src0 file to IMM on Gen8+ flow control instructions.
Kenneth Graunke [Sun, 10 Aug 2014 14:10:55 +0000 (07:10 -0700)]
i965/eu: Set src0 file to IMM on Gen8+ flow control instructions.

According to the documentation, we need to set the source 0 register
type to IMM for flow control instructions that have both JIP and UIP.
Out of paranoia, just make all flow control instructions use IMM;
there's no benefit to using ARF anyway, and it could cause trouble that's
difficult to diagnose.

See commit 9584959123b0453cf5313722357e3abb9f736aa7, which did the
analogous change in the gen8_generator code.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoi965/eu: Refactor brw_WHILE to share a bit more code on Gen6+.
Kenneth Graunke [Sun, 10 Aug 2014 14:06:36 +0000 (07:06 -0700)]
i965/eu: Refactor brw_WHILE to share a bit more code on Gen6+.

We're going to add a Gen8+ case shortly, which would need to duplicate
this code again.  Instead, share it.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoi965/eu: Emulate F32TO16 and F16TO32 on Broadwell.
Kenneth Graunke [Sat, 28 Jun 2014 23:08:39 +0000 (16:08 -0700)]
i965/eu: Emulate F32TO16 and F16TO32 on Broadwell.

When we combine the Gen4-7 and Gen8+ generators, we'll need to handle
half float packing/unpacking functions somehow.  The Gen8+ generator
code today just emulates the behavior of the Gen7 F32TO16/F16TO32
instructions, including the align16 mode bugs.

Rather than messing with fs_generator/vec4_generator, I decided to just
emulate the instructions at the brw_eu_emit.c layer.

v2: Change gen >= 7 asserts to gen == 7 (suggested by Chris Forbes).
    Fix regressions on Haswell in VS tests due to type assertions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoi965/vec4: Port Gen8 SET_VERTEX_COUNT handling to vec4_generator.
Kenneth Graunke [Mon, 11 Aug 2014 22:53:54 +0000 (15:53 -0700)]
i965/vec4: Port Gen8 SET_VERTEX_COUNT handling to vec4_generator.

Broadwell requires the number of vertices written by the geometry shader
to be specified in a separate register, as part of the terminating
message's payload.

This also means GS_OPCODE_THREAD_END needs to increment mlen.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoi965/vec4: Switch to MOV, not OR, for GS_OPCODE_THREAD_END on Gen8.
Kenneth Graunke [Mon, 11 Aug 2014 15:13:05 +0000 (08:13 -0700)]
i965/vec4: Switch to MOV, not OR, for GS_OPCODE_THREAD_END on Gen8.

Either should work.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoi965/vec4: Use MOV, not OR, to set URB write channel mask bits.
Kenneth Graunke [Mon, 11 Aug 2014 03:06:44 +0000 (20:06 -0700)]
i965/vec4: Use MOV, not OR, to set URB write channel mask bits.

g0.5 has nothing of value to contribute to m0.5.  In both the VS and GS
payload, g0.5 contains the scratch space pointer - which is definitely
not of any use.  The GS payload also contains FFTID, but the URB write
message header doesn't want FFTID.

The only reason I used OR was because Eric originally requested it.
On Broadwell, I used MOV, and that's worked out fine.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoi965/fs: Don't set flag_subreg_nr = 1 on predicated FB write setup.
Kenneth Graunke [Sun, 10 Aug 2014 23:15:51 +0000 (16:15 -0700)]
i965/fs: Don't set flag_subreg_nr = 1 on predicated FB write setup.

On Haswell, we implement "discard" via predicated SEND messages, using
f0.1 instead of f0.0.  To accomplish this, we set inst->flag_subreg to 1
on the FS_OPCODE_FB_WRITE.

Most instructions using fs_inst::flag_subreg expand to a single assembly
instruction.  However, FS_OPCODE_FB_WRITE can generate several MOVs for
setting up header information.  We don't want to set flag_subreg on
those, so override the default state back to 0.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965/vec4: Respect ir->force_writemask_all in Gen8 code generation.
Kenneth Graunke [Mon, 11 Aug 2014 15:41:36 +0000 (08:41 -0700)]
i965/vec4: Respect ir->force_writemask_all in Gen8 code generation.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years agoi965/vec4: Set NoMask for GS_OPCODE_SET_VERTEX_COUNT on Gen8+.
Kenneth Graunke [Mon, 11 Aug 2014 15:15:57 +0000 (08:15 -0700)]
i965/vec4: Set NoMask for GS_OPCODE_SET_VERTEX_COUNT on Gen8+.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
10 years agogallium/r300: Fix a link error in the tests
Jason Ekstrand [Tue, 12 Aug 2014 18:12:47 +0000 (11:12 -0700)]
gallium/r300: Fix a link error in the tests

The link error occurs because the static libraries are linked in the wrong
order.  This fixes it.

Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82483
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
10 years agoi965: Return NONE from brw_swap_cmod on unknown input.
Matt Turner [Mon, 11 Aug 2014 18:12:43 +0000 (11:12 -0700)]
i965: Return NONE from brw_swap_cmod on unknown input.

Comparing ~0u with a packed enum (i.e., 1 byte) always evaluates to
false. Shouldn't gcc warn about this?

Reported-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
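
Why the comparison is always false can be shown with a small sketch. uint8_t stands in for the one-byte packed enum here, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* The one-byte value is promoted to int (0..255) and then converted to
 * unsigned int, while ~0u is UINT_MAX, so this can never be true. */
static int demo_matches_all_ones(uint8_t value)
{
    return value == ~0u;
}

/* Truncating the constant to the same width makes the test meaningful. */
static int demo_matches_truncated(uint8_t value)
{
    return value == (uint8_t)~0u;
}
```

Even for the value 0xff the first function returns 0, while the second returns 1. Returning an explicit in-range sentinel, as the commit does with NONE, avoids the problem altogether.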
10 years agodocs: Update release notes and GL3.txt for GL_ARB_texture_compression_bptc
Neil Roberts [Wed, 23 Jul 2014 10:25:31 +0000 (11:25 +0100)]
docs: Update release notes and GL3.txt for GL_ARB_texture_compression_bptc

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agomesa/meta: Support decompressing floating-point formats
Neil Roberts [Thu, 31 Jul 2014 13:07:50 +0000 (14:07 +0100)]
mesa/meta: Support decompressing floating-point formats

Previously the Meta implementation of glGetTexImage would fall back to
_mesa_get_teximage if the texturing is not using an unsigned normalised
format. However in order to support the half-float formats of BPTC textures we
can make it render to a floating-point renderbuffer instead. This patch makes
decompression_state have two FBOs, one for the GL_RGBA format and one for
GL_RGBA32F. If a floating-point texture is encountered it will try setting up
a floating-point FBO. It will now also check the status of the FBO and fall
back to _mesa_get_teximage if the FBO is not complete.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoswrast: Enable GL_ARB_texture_compression_bptc
Neil Roberts [Thu, 17 Jul 2014 13:45:01 +0000 (14:45 +0100)]
swrast: Enable GL_ARB_texture_compression_bptc

Enables BPTC texture compression on the software rasterizer.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoi965: Enable the GL_ARB_texture_compression_bptc extension
Neil Roberts [Thu, 17 Jul 2014 13:38:20 +0000 (14:38 +0100)]
i965: Enable the GL_ARB_texture_compression_bptc extension

Enables the BPTC extension on Gen>=7 and adds the necessary format mappings to
get the right surface type value.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agomesa/main: Modify generate_mipmap_compressed to cope with float textures
Neil Roberts [Fri, 25 Jul 2014 16:38:22 +0000 (17:38 +0100)]
mesa/main: Modify generate_mipmap_compressed to cope with float textures

Once we add BPTC texture support we will need to generate mipmaps for
compressed floating point textures too. Most of the code seems to already be
there but it just needs a few extra lines to get it to use GL_FLOAT instead of
GL_UNSIGNED_BYTE as the type for the temporary buffers.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agomesa: Add texstore functions for BPTC-compressed textures
Neil Roberts [Thu, 17 Jul 2014 13:33:10 +0000 (14:33 +0100)]
mesa: Add texstore functions for BPTC-compressed textures

This adds compressors for all four of the BPTC compressed-texture formats. The
compressor is written from scratch and takes a very simple approach. It always
uses a single mode of the BPTC format (4 for unorm and 3 for half-floats) and
picks the two endpoints by dividing the texels into those which have more or
less than the average luminance of the block and then calculating an average
color of the texels within each division.

It's probably not really sensible to try to use BPTC compression at runtime
because, for example, the Nvidia offline compression tool can take on
the order of an hour to compress a full-screen image. With that in mind I
don't think it's worth having a proper compressor in Mesa and this approach
gives reasonable results for a usage that is basically a corner case.

v2: Always use the custom compressor, even for the unorm formats. Fix the
    quantization step for the half-float format compressor. Fixed a typo which
    was breaking the right-hand edge of half-float textures with a width that
    isn't a multiple of four.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
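
The endpoint selection described above can be sketched for an 8-bit RGB block as follows. This is only an illustration under stated assumptions: the luminance weights, the flat 16x3 byte layout and all demo_* names are hypothetical, and the mode encoding and index quantization of the real compressor are left out.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_TEXELS 16 /* one 4x4 block */

/* Integer luminance approximation; the weights are an assumption. */
static unsigned demo_luminance(const uint8_t *rgb)
{
    return (rgb[0] * 77u + rgb[1] * 150u + rgb[2] * 29u) >> 8;
}

/* texels holds DEMO_TEXELS RGB triples. Splits them into a darker and a
 * brighter group around the average luminance and writes the average
 * colour of each group to lo and hi. */
static void demo_pick_endpoints(const uint8_t *texels,
                                uint8_t lo[3], uint8_t hi[3])
{
    unsigned sum_lum = 0, avg_lum;
    unsigned sum[2][3] = {{0}};
    unsigned count[2] = {0, 0};
    int i, c;

    for (i = 0; i < DEMO_TEXELS; i++)
        sum_lum += demo_luminance(texels + i * 3);
    avg_lum = sum_lum / DEMO_TEXELS;

    for (i = 0; i < DEMO_TEXELS; i++) {
        int bright = demo_luminance(texels + i * 3) > avg_lum;

        for (c = 0; c < 3; c++)
            sum[bright][c] += texels[i * 3 + c];
        count[bright]++;
    }

    for (c = 0; c < 3; c++) {
        /* The darker group is never empty: the minimum luminance cannot
         * exceed the (rounded-down) average. */
        lo[c] = (uint8_t)(sum[0][c] / count[0]);
        /* A uniform block has no brighter texels; reuse lo in that case. */
        hi[c] = count[1] ? (uint8_t)(sum[1][c] / count[1]) : lo[c];
    }
}
```

For a block that is half black and half white this yields black and white endpoints; for a uniform block both endpoints collapse to the same colour.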
10 years agomesa: Add texel fetch functions for BPTC-compressed textures
Neil Roberts [Thu, 17 Jul 2014 13:30:29 +0000 (14:30 +0100)]
mesa: Add texel fetch functions for BPTC-compressed textures

Adds functions to fetch from any of the four BPTC-compressed formats.

v2: Set the alpha component to 1.0 when fetching from the half-float formats
    instead of leaving it uninitialised. Don't linearize the alpha component
    when fetching from sRGB.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agomesa: Add the format enums for BPTC-compressed images
Neil Roberts [Thu, 17 Jul 2014 13:18:27 +0000 (14:18 +0100)]
mesa: Add the format enums for BPTC-compressed images

This adds the following four Mesa image format enums which correspond to the
four BPTC compressed texture formats:

 MESA_FORMAT_BPTC_RGBA_UNORM
 MESA_FORMAT_BPTC_SRGB_ALPHA_UNORM
 MESA_FORMAT_BPTC_RGB_SIGNED_FLOAT
 MESA_FORMAT_BPTC_RGB_UNSIGNED_FLOAT

It also updates the format information functions to handle these and the
corresponding GL enums.

v2: Also modify _mesa_get_format_color_encoding, _mesa_get_srgb_format_linear
    and _mesa_get_uncompressed_format

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agomesa/format_info: Add support for the BPTC layout
Neil Roberts [Wed, 6 Aug 2014 15:52:14 +0000 (16:52 +0100)]
mesa/format_info: Add support for the BPTC layout

Adds the ‘bptc’ layout to get_channel_bits. The channel bits for BPTC depend
on the mode but as it only has to be an approximation this sets it to 8 for
the two UNORM formats and 16 for the two half-float formats. These represent
the minimum number of bits of variation that can be generated by the
interpolation of the two formats.

This doesn't quite match what we do for S3TC which only returns 4 even though
it can similarly generate 8 bits from the interpolation. However it does match
what we return for ETC2. For reference, NVidia seems to return 8 bits for the
UNORM formats and 32 bits for the half-float formats.

v2: Change the number of bits to 8/8/8/8 for the UNORM formats and 16/16/16
    for the half-float formats.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
10 years agomesa/format_info: Add support for compressed floating-point formats
Neil Roberts [Wed, 6 Aug 2014 15:52:14 +0000 (16:52 +0100)]
mesa/format_info: Add support for compressed floating-point formats

If the name of a compressed texture format has ‘FLOAT’ in it, it will now set
the data type of the format to GL_FLOAT. This will be needed for the BPTC
half-float formats.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agomesa: Fix the base format for GL_COMPRESSED_RGB_BPTC_*_FLOAT_ARB
Neil Roberts [Fri, 25 Jul 2014 10:31:07 +0000 (11:31 +0100)]
mesa: Fix the base format for GL_COMPRESSED_RGB_BPTC_*_FLOAT_ARB

The signed and unsigned half-float BPTC-compressed formats were being reported
as having a base format of GL_RGBA but they don't store an alpha channel so it
should be GL_RGB.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agomesa: Add the GL_ARB_texture_compression_bptc extension
Neil Roberts [Thu, 17 Jul 2014 13:21:26 +0000 (14:21 +0100)]
mesa: Add the GL_ARB_texture_compression_bptc extension

This adds a boolean in the gl_extensions struct for
GL_ARB_texture_compression_bptc as well as an entry in extension_table.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agowinsys/radeon: fix nop packet padding for hawaii
Andreas Boll [Mon, 4 Aug 2014 10:48:50 +0000 (12:48 +0200)]
winsys/radeon: fix nop packet padding for hawaii

The initial firmware for hawaii does not support type3 nop packets.
Detect the new hawaii firmware with query RADEON_INFO_ACCEL_WORKING2.
If the returned value is 3, then the new firmware is used.

This patch uses type2 for the old firmware and type3 for the new firmware.

It fixes the cases when the old firmware is used and the user wants to
manually enable acceleration.
The two possible scenarios are:
 - the kernel has no support for the new firmware.
 - the kernel has support for the new firmware but only the old firmware
   is available.

Additionally, this patch disables GPU acceleration on hawaii if the kernel
returns a value < 2. In this case the kernel lacks the required fixes
for proper acceleration.

v2:
 - Fix indentation
 - Use private struct radeon_drm_winsys instead of public struct radeon_info
 - Rename r600_accel_working2 to accel_working2

v3:
 - Use type2 nop packet for returned value < 3

v4:
 - Fail to initialize winsys for returned value < 2

Cc: mesa-stable@lists.freedesktop.org
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Marek Olšák <marek.olsak@amd.com>
Cc: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
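
The resulting decision can be sketched as a small function. The enum and function names are hypothetical and this is not the winsys code; treating values above 3 like 3 is an assumption, since the message only spells out the cases < 2, < 3 and 3.

```c
#include <assert.h>

enum demo_nop_choice {
    DEMO_ACCEL_UNAVAILABLE, /* value < 2: winsys initialization fails */
    DEMO_NOP_TYPE2,         /* value == 2: old firmware, pad with type2 */
    DEMO_NOP_TYPE3          /* value >= 3: new firmware, pad with type3 */
};

/* accel_working2 is the value the kernel returns for the
 * RADEON_INFO_ACCEL_WORKING2 query on hawaii. */
static enum demo_nop_choice demo_hawaii_nop(unsigned accel_working2)
{
    if (accel_working2 < 2)
        return DEMO_ACCEL_UNAVAILABLE;
    if (accel_working2 < 3)
        return DEMO_NOP_TYPE2;
    return DEMO_NOP_TYPE3; /* assumption: anything above 3 behaves like 3 */
}
```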
10 years agomesa: regenerate gl_mangle.h
Brian Paul [Tue, 12 Aug 2014 14:01:41 +0000 (08:01 -0600)]
mesa: regenerate gl_mangle.h

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agomesa: update wglext.h to version 20140810
Brian Paul [Tue, 12 Aug 2014 13:31:46 +0000 (07:31 -0600)]
mesa: update wglext.h to version 20140810

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agomesa: update glxext.h to version 20140810
Brian Paul [Tue, 12 Aug 2014 13:31:25 +0000 (07:31 -0600)]
mesa: update glxext.h to version 20140810

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agomesa: update glext.h to version 20140810
Brian Paul [Tue, 12 Aug 2014 13:30:52 +0000 (07:30 -0600)]
mesa: update glext.h to version 20140810

This brings in the new OpenGL 4.5 features.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agosvga: Add a limit to the maximum surface size
Charmaine Lee [Tue, 12 Aug 2014 13:37:12 +0000 (07:37 -0600)]
svga: Add a limit to the maximum surface size

This patch adds a limit to the maximum surface size which is
based on the maximum size of a single mob. If this value is not
available, the maximum surface size is by default set to 128 MB.

Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
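
A minimal sketch of the fallback, assuming that an unavailable mob size is reported as 0. The demo_* names are hypothetical and do not come from the svga driver.

```c
#include <assert.h>
#include <stdint.h>

/* Default used when the maximum mob size is not available: 128 MB. */
#define DEMO_DEFAULT_MAX_SURFACE_SIZE (128u * 1024u * 1024u)

/* max_mob_size == 0 means "not reported" in this sketch (assumption). */
static uint64_t demo_max_surface_size(uint64_t max_mob_size)
{
    return max_mob_size ? max_mob_size : DEMO_DEFAULT_MAX_SURFACE_SIZE;
}

/* A surface is accepted only if it fits within the limit. */
static int demo_surface_fits(uint64_t bytes, uint64_t max_mob_size)
{
    return bytes <= demo_max_surface_size(max_mob_size);
}
```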
10 years agomesa/st: Move declaration to top of block.
José Fonseca [Tue, 12 Aug 2014 13:24:38 +0000 (14:24 +0100)]
mesa/st: Move declaration to top of block.

To fix MSVC build failure.

Trivial.

10 years agomesa/st: add support for dynamic sampler offsets
Ilia Mirkin [Wed, 6 Aug 2014 04:43:29 +0000 (00:43 -0400)]
mesa/st: add support for dynamic sampler offsets

Replace the plain sampler index with a register reference to a sampler.
We also need to keep track of the sampler array size when there is a
relative reference so that we can mark the whole array used.

To facilitate implementation, we add a separate ADDR register that
exclusively handles the sampler relative address. Other approaches would
be more invasive.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agoradeon/uvd: fix gpu_address for video surfaces
Christian König [Mon, 11 Aug 2014 14:40:43 +0000 (16:40 +0200)]
radeon/uvd: fix gpu_address for video surfaces

We need to get the new gpu_address as well when
reallocating the cs buffer.

Bug: https://bugs.freedesktop.org/show_bug.cgi?id=82428

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Kai Wasserbäch <kai@dev.carbon-project.org>
10 years agomesa: Add a new function for getting the nonconst sampler array index
Chris Forbes [Sun, 3 Aug 2014 07:55:55 +0000 (19:55 +1200)]
mesa: Add a new function for getting the nonconst sampler array index

If the array index is not a constant expression, the existing support
will assume a zero offset (giving us the sampler index of the base of
the array).

For dynamically uniform indexing of sampler arrays, we need both that
and the indexing expression.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agoglsl: Allow dynamically uniform sampler array indexing with 4.0/gs5
Chris Forbes [Sun, 3 Aug 2014 05:57:05 +0000 (17:57 +1200)]
glsl: Allow dynamically uniform sampler array indexing with 4.0/gs5

V2: Expand comment to explain what dynamically uniform expressions are
about.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agonvc0/ir: describe the tex arguments for fermi/kepler
Ilia Mirkin [Thu, 7 Aug 2014 03:45:05 +0000 (23:45 -0400)]
nvc0/ir: describe the tex arguments for fermi/kepler

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agonvc0/ir: add kepler+ support for indirect texture references
Ilia Mirkin [Wed, 9 Jul 2014 04:41:11 +0000 (00:41 -0400)]
nvc0/ir: add kepler+ support for indirect texture references

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agonvc0/ir: add base tex offset for fermi indirect tex case
Ilia Mirkin [Wed, 6 Aug 2014 05:22:49 +0000 (01:22 -0400)]
nvc0/ir: add base tex offset for fermi indirect tex case

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
10 years agoi965: Revert part of f5cc3fdcf1680b116612fac7c39f1bd79f5e555e.
Kenneth Graunke [Mon, 11 Aug 2014 22:05:54 +0000 (15:05 -0700)]
i965: Revert part of f5cc3fdcf1680b116612fac7c39f1bd79f5e555e.

Fixes non-termination in various Piglit tests.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
10 years agovc4: Flip which primitives are considered front-facing.
Eric Anholt [Sat, 9 Aug 2014 18:01:53 +0000 (11:01 -0700)]
vc4: Flip which primitives are considered front-facing.

This mostly fixes glxgears rendering.

10 years agovc4: Don't forget to set the depth clear value in the packet.
Eric Anholt [Sat, 9 Aug 2014 18:00:51 +0000 (11:00 -0700)]
vc4: Don't forget to set the depth clear value in the packet.

This gets glxgears partially rendering again.

10 years agovc4: Add support for gl_FragCoord.
Eric Anholt [Tue, 5 Aug 2014 21:24:29 +0000 (14:24 -0700)]
vc4: Add support for gl_FragCoord.

This isn't passing all tests (glsl-fs-fragcoord-zw-ortho, for example),
but it does get a bunch more tests passing.

v2: Rebase on helpers change.

10 years agovc4: Refactor shader input setup again.
Eric Anholt [Tue, 5 Aug 2014 21:23:40 +0000 (14:23 -0700)]
vc4: Refactor shader input setup again.

This makes some space for handling special inputs like fragcoords.

10 years agovc4: Clean up the tile alloc buffer size.
Eric Anholt [Tue, 5 Aug 2014 18:00:51 +0000 (11:00 -0700)]
vc4: Clean up the tile alloc buffer size.

This prevents some simulator assertion failures, but it does mean (since
I've dropped the "* 16" padding) that on real hardware you need a kernel
that does overflow memory management (currently, "drm/vc4: Add support for
binner overflow memory allocation." in my kernel tree).

10 years agovc4: Clarify some values implicitly chosen for binning config.
Eric Anholt [Tue, 5 Aug 2014 18:00:08 +0000 (11:00 -0700)]
vc4: Clarify some values implicitly chosen for binning config.

These #defines are 0, but they should help the math above make more sense.

10 years agovc4: Improve simulator memory allocation.
Eric Anholt [Tue, 5 Aug 2014 17:54:56 +0000 (10:54 -0700)]
vc4: Improve simulator memory allocation.

This should reduce a bunch of spurious failures in sim.

10 years agovc4: Handle stride==0 in VBO validation
Eric Anholt [Tue, 5 Aug 2014 01:30:33 +0000 (18:30 -0700)]
vc4: Handle stride==0 in VBO validation

10 years agovc4: Stash some debug code for looking at what BOs are at what hindex.
Eric Anholt [Mon, 4 Aug 2014 23:38:07 +0000 (16:38 -0700)]
vc4: Stash some debug code for looking at what BOs are at what hindex.

When you're debugging validation, it's nice to know what the BOs are for.

10 years agovc4: Use GEM under simulation even for non-winsys BOs.
Eric Anholt [Mon, 4 Aug 2014 20:01:29 +0000 (13:01 -0700)]
vc4: Use GEM under simulation even for non-winsys BOs.

In addition to reducing sim-specific code, it also avoids our local handle
allocation conflicting with the host GEM's handle numbering, which was
causing vc4_gem_hindex() to not distinguish between winsys BOs and the
same-numbered non-winsys bo.

10 years agovc4: Don't forget to unmap the GEM BO when freeing.
Eric Anholt [Mon, 4 Aug 2014 20:00:56 +0000 (13:00 -0700)]
vc4: Don't forget to unmap the GEM BO when freeing.

Otherwise it'll stick around forever.

10 years agovc4: Add validation of raster-format textures.
Eric Anholt [Sun, 3 Aug 2014 04:28:34 +0000 (21:28 -0700)]
vc4: Add validation of raster-format textures.

... and reject everything else, for now.

v2: Rebase on v2 of the rendering config validation change.

10 years agovc4: Drop VC4_PACKET_PRIMITIVE_LIST_FORMAT.
Eric Anholt [Sun, 3 Aug 2014 04:23:20 +0000 (21:23 -0700)]
vc4: Drop VC4_PACKET_PRIMITIVE_LIST_FORMAT.

It's not relevant to our command streams any more.

v2: Fix indentation and a typo in the comment.

10 years agovc4: Add validation that vertex indices don't overflow VBO bounds.
Eric Anholt [Sun, 3 Aug 2014 04:06:50 +0000 (21:06 -0700)]
vc4: Add validation that vertex indices don't overflow VBO bounds.

10 years agovc4: Fix the shader record size for extended strides.
Eric Anholt [Sun, 3 Aug 2014 03:44:39 +0000 (20:44 -0700)]
vc4: Fix the shader record size for extended strides.

It turns out they aren't packed when attributes are missing, according to
both docs and simulation.

10 years agovc4: Fix the shader record size for extended strides.
Eric Anholt [Sun, 3 Aug 2014 03:44:39 +0000 (20:44 -0700)]
vc4: Fix the shader record size for extended strides.

It turns out they aren't packed when attributes are missing, according to
both docs and simulation.

v2: Drop unused variable.

10 years agovc4: Add a bunch of validation of render mode configuration.
Eric Anholt [Sun, 3 Aug 2014 03:30:18 +0000 (20:30 -0700)]
vc4: Add a bunch of validation of render mode configuration.

v2: Fix a build break after some previous rebase.

10 years agovc4: Store the (currently always linear) tiling format in the resource.
Eric Anholt [Sun, 3 Aug 2014 03:19:38 +0000 (20:19 -0700)]
vc4: Store the (currently always linear) tiling format in the resource.

10 years agovc4: Add a bunch of validation of the binning mode config.
Eric Anholt [Sat, 2 Aug 2014 00:11:38 +0000 (17:11 -0700)]
vc4: Add a bunch of validation of the binning mode config.

10 years agovc4: Validate that the same BO doesn't get reused for different purposes.
Eric Anholt [Sat, 2 Aug 2014 03:23:31 +0000 (20:23 -0700)]
vc4: Validate that the same BO doesn't get reused for different purposes.

We don't care if things like vertex data get smashed by render target
data, but we do need to make sure that shader code doesn't get rendered
to.

v2: Fix overflowing read of gl_relocs[] that incorrectly flagged some
    VBOs as shader code.
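
The idea can be sketched as a per-BO usage tag that is set on first use and checked on every later use. This is an illustration only: the two-class model (shader code versus everything else) and the demo_* names are assumptions, not the validation code.

```c
#include <assert.h>

enum demo_bo_use {
    DEMO_BO_UNUSED,
    DEMO_BO_DATA,        /* vertex data, render targets, ... */
    DEMO_BO_SHADER_CODE  /* must never be written by rendering */
};

/* Records the first use of a BO and rejects any later use that falls
 * into a different class. Returns 1 when the use is allowed. */
static int demo_claim_bo(enum demo_bo_use *recorded, enum demo_bo_use wanted)
{
    if (*recorded == DEMO_BO_UNUSED) {
        *recorded = wanted;
        return 1;
    }
    return *recorded == wanted;
}
```

Vertex data and render targets share one class here because overwriting one with the other is harmless, whereas a BO once claimed as shader code can no longer be claimed as a render target.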

10 years agovc4: Use the packet #defines in the kernel validation code.
Eric Anholt [Sat, 2 Aug 2014 00:31:40 +0000 (17:31 -0700)]
vc4: Use the packet #defines in the kernel validation code.

10 years agovc4: Rename GEM_HANDLES to be in a namespace.
Eric Anholt [Sat, 2 Aug 2014 00:17:03 +0000 (17:17 -0700)]
vc4: Rename GEM_HANDLES to be in a namespace.

It's not a real VC4 hardware packet, but I've put in a comment to explain
it.

10 years agovc4: Clean up TMU write validation.
Eric Anholt [Sat, 2 Aug 2014 00:05:21 +0000 (17:05 -0700)]
vc4: Clean up TMU write validation.

The comment conflicted with the support in the code, so I moved the TMU
write validation to where the comment was, and dropped some dead arguments
from the functions while changing their signatures.

10 years agovc4: Update a comment about shader validation
Eric Anholt [Sat, 2 Aug 2014 00:01:44 +0000 (17:01 -0700)]
vc4: Update a comment about shader validation

10 years agovc4: Add proper translation from Zc to Zs for vertex output.
Eric Anholt [Fri, 1 Aug 2014 23:02:37 +0000 (16:02 -0700)]
vc4: Add proper translation from Zc to Zs for vertex output.

This fixes the remaining failure in depthfunc.

10 years agovc4: Add support for depth clears and tests within a tile.
Eric Anholt [Fri, 1 Aug 2014 20:32:49 +0000 (13:32 -0700)]
vc4: Add support for depth clears and tests within a tile.

This doesn't load/store the Z contents across submits yet.  It also
disables early Z, since it's going to require tracking of Z functions
across multiple state updates to track the early Z direction and whether
it can be used.

v2: Move the key setup to before the search for the key.

10 years agovc4: Avoid flushing when mapping buffers that aren't in the batch.
Eric Anholt [Fri, 1 Aug 2014 22:33:06 +0000 (15:33 -0700)]
vc4: Avoid flushing when mapping buffers that aren't in the batch.

This should prevent a bunch of unnecessary flushes for things like
updating immediate vertex data.

10 years agovc4: Drop the flush at the end of the draw
Eric Anholt [Thu, 31 Jul 2014 05:17:56 +0000 (22:17 -0700)]
vc4: Drop the flush at the end of the draw

Now we actually get multiple draw calls per submit.

10 years agovc4: Align following shader recs to 16 bytes.
Eric Anholt [Fri, 1 Aug 2014 18:24:29 +0000 (11:24 -0700)]
vc4: Align following shader recs to 16 bytes.

Otherwise, the low address bits will end up being interpreted as attribute
counts.
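
A minimal sketch of why the alignment matters, assuming a packed pointer
word that keeps a small count in its low 4 bits. The helper names and the
exact field width are hypothetical and are not taken from the driver:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round offset up to the next multiple of align.
 * align must be a non-zero power of two. */
static uint32_t
align_up(uint32_t offset, uint32_t align)
{
        assert(align != 0 && (align & (align - 1)) == 0);
        return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical packing: a 16-byte-aligned record offset shares one word
 * with a count stored in the low 4 bits.  If the offset were not aligned,
 * its low bits would be indistinguishable from the count. */
static uint32_t
pack_rec_pointer(uint32_t rec_offset, uint32_t attr_count)
{
        assert((rec_offset & 0xf) == 0);
        assert(attr_count <= 0xf);
        return rec_offset | attr_count;
}
```

With this layout, a record that would start at offset 100 is moved to
align_up(100, 16) before its offset is packed, so the low bits carry only
the count.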

10 years agovc4: Fix a potential src buffer overflow in shader rec validation.
Eric Anholt [Thu, 31 Jul 2014 20:14:00 +0000 (13:14 -0700)]
vc4: Fix a potential src buffer overflow in shader rec validation.

10 years agovc4: Keep a reference to BOs queued for rendering.
Eric Anholt [Thu, 31 Jul 2014 19:19:29 +0000 (12:19 -0700)]
vc4: Keep a reference to BOs queued for rendering.

Otherwise, once we're not flushing at the end of every draw, we'll free
things like gallium resources, and free the backing GEM object, before
we've flushed the rendering using it to the kernel.

10 years agovc4: Compute the proper end address of the relocated command lists.
Eric Anholt [Thu, 31 Jul 2014 19:46:13 +0000 (12:46 -0700)]
vc4: Compute the proper end address of the relocated command lists.

render_cl_size/bin_cl_size includes relocations, while the hardware buffer
doesn't.  If you don't emit a HALT packet, the command parser continues
until the end register's value.  We can't allow executing unvalidated
buffer contents (and it's actually harmful in the render lists Mesa is
emitting, since VC4_PACKET_STORE_MS_TILE_BUFFER_AND_EOF doesn't trigger a
halt).

10 years agovc4: Walk tiles horizontally, then vertically.
Eric Anholt [Thu, 31 Jul 2014 19:45:41 +0000 (12:45 -0700)]
vc4: Walk tiles horizontally, then vertically.

I was confused looking at my addresses in dumps because I was seeing the
tile branch offsets jumping all over.
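
The walk order described above can be sketched as a nested loop with rows
on the outside and columns on the inside. This is an illustration only;
the struct and function names are hypothetical and do not come from the
driver:

```c
#include <stdint.h>

/* Hypothetical tile coordinate pair. */
struct tile {
        uint32_t x, y;
};

/* Return the n-th tile visited when walking horizontally first (x varies
 * fastest), then vertically.  If n is out of range, return the sentinel
 * (tiles_x, tiles_y). */
static struct tile
nth_tile_visited(uint32_t n, uint32_t tiles_x, uint32_t tiles_y)
{
        uint32_t i = 0;

        for (uint32_t y = 0; y < tiles_y; y++) {
                for (uint32_t x = 0; x < tiles_x; x++) {
                        if (i++ == n)
                                return (struct tile){ x, y };
                }
        }

        return (struct tile){ tiles_x, tiles_y };
}
```

Consecutive tiles in a row are visited back to back, so per-tile offsets
that are laid out row by row increase steadily instead of jumping around.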

10 years agovc4: Track clears versus uncleared draws, and the clear color.
Eric Anholt [Wed, 23 Jul 2014 18:21:04 +0000 (11:21 -0700)]
vc4: Track clears versus uncleared draws, and the clear color.

This is a step toward queueing more than one draw per frame.

Fixes piglit attribute0 test, since we get a working clear color now.

10 years agovc4: Move the rest of RCL setup to flush time.
Eric Anholt [Thu, 31 Jul 2014 18:23:22 +0000 (11:23 -0700)]
vc4: Move the rest of RCL setup to flush time.

We only want to set up render target config and clear colors once per
frame.

10 years agovc4: Move render command list calls to vc4_flush()
Eric Anholt [Thu, 31 Jul 2014 18:22:17 +0000 (11:22 -0700)]
vc4: Move render command list calls to vc4_flush()

10 years agovc4: Move bin command list ending commands to vc4_flush()
Eric Anholt [Thu, 31 Jul 2014 18:19:41 +0000 (11:19 -0700)]
vc4: Move bin command list ending commands to vc4_flush()

10 years agovc4: Rename fields in the kernel interface.
Eric Anholt [Wed, 23 Jul 2014 03:16:10 +0000 (20:16 -0700)]
vc4: Rename fields in the kernel interface.

I decided I didn't like "len" compared to "size", and I keep typing
shader_rec instead of shader_record[s] elsewhere, so make it consistent.

10 years agovc4: Fix things to validate more than one shader state in a submit.
Eric Anholt [Wed, 23 Jul 2014 03:10:01 +0000 (20:10 -0700)]
vc4: Fix things to validate more than one shader state in a submit.

10 years agovc4: Rewrite the kernel ABI to support texture uniform relocation.
Eric Anholt [Mon, 21 Jul 2014 18:27:35 +0000 (11:27 -0700)]
vc4: Rewrite the kernel ABI to support texture uniform relocation.

This required building a shader parser that would walk the program to find
where the texturing-related uniforms are in the uniforms stream.

Note that as of this commit, a new kernel is required for rendering on
actual VC4 hardware (currently that commit is named "drm/vc4: Introduce
shader validation and better command stream validation.", but is likely to
be squashed as part of an eventual merge of the kernel driver).

10 years agovc4: Add docs for the drm interface
Eric Anholt [Mon, 21 Jul 2014 18:26:24 +0000 (11:26 -0700)]
vc4: Add docs for the drm interface

10 years agovc4: Add load/store to the validator
Eric Anholt [Fri, 18 Jul 2014 21:18:23 +0000 (14:18 -0700)]
vc4: Add load/store to the validator

10 years agovc4: Switch simulator to using kernel validator
Eric Anholt [Fri, 18 Jul 2014 20:06:01 +0000 (13:06 -0700)]
vc4: Switch simulator to using kernel validator

This ensures that when I'm using the simulator, I get a closer match to
what behavior on real hardware will be.  It lets me rapidly iterate on the
kernel validation code (which otherwise has a several-minute turnaround
time), and helps catch buffer overflow bugs in the userspace driver
faster.

10 years agovc4: Drop pointless shader state struct
Eric Anholt [Fri, 18 Jul 2014 20:28:34 +0000 (13:28 -0700)]
vc4: Drop pointless shader state struct

10 years agovc4: Add support for texture rectangles
Eric Anholt [Thu, 17 Jul 2014 04:39:05 +0000 (21:39 -0700)]
vc4: Add support for texture rectangles

v2: Rebase on helpers change.

10 years agovc4: Add support for texturing (under simulation)
Eric Anholt [Tue, 15 Jul 2014 19:29:32 +0000 (12:29 -0700)]
vc4: Add support for texturing (under simulation)

Only rgba8888 works, and only a single texture unit, and it's only under
simulation because I haven't built the kernel interface yet.

v2: Rebase on helpers.
v3: Fold in the don't-break-the-arm-build fix.