Courtney Goeltzenleuchter [Mon, 4 Nov 2013 20:31:37 +0000 (13:31 -0700)]
mesa: Update TexStorage to support ARB_texture_view
Call TextureView helper function to set TextureView state
appropriately for the TexStorage calls.
Misc updates from review feedback.
Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Courtney Goeltzenleuchter [Wed, 6 Nov 2013 21:40:31 +0000 (14:40 -0700)]
mesa: add texture_view helper function for TexStorage
Add helper function to set texture_view state from TexStorage calls.
Include review feedback.
Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Courtney Goeltzenleuchter [Wed, 6 Nov 2013 21:34:09 +0000 (14:34 -0700)]
mesa: Fill out ARB_texture_view entry points
Add Mesa TextureView logic.
Incorporate feedback on ARB_texture_view:
- Add S3TC VIEW_CLASSes to compatibility table
- Use existing _mesa_get_tex_image
- Clean up error strings
- Use bool instead of GLboolean for internal functions
- Split compound level & layer test into individual tests
- eliminate helper macro for VIEW_CLASS table
- do not call driver if ptr null.
Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Courtney Goeltzenleuchter [Mon, 25 Nov 2013 23:31:26 +0000 (16:31 -0700)]
mesa: consolidate multiple next_mipmap_level_size
Refactor to make next_mipmap_level_size defined in mipmap.c a
_mesa_ helper function that can then be used by texture_view
Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Courtney Goeltzenleuchter [Mon, 4 Nov 2013 21:09:22 +0000 (14:09 -0700)]
mesa: Add driver entry point for ARB_texture_view
Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Courtney Goeltzenleuchter [Mon, 4 Nov 2013 20:29:48 +0000 (13:29 -0700)]
mesa: ARB_texture_view get parameters
Add support for ARB_texture_view get parameters:
GL_TEXTURE_VIEW_MIN_LEVEL
GL_TEXTURE_VIEW_NUM_LEVELS
GL_TEXTURE_VIEW_MIN_LAYER
GL_TEXTURE_VIEW_NUM_LAYERS
Incorporate feedback regarding when to allow query of
GL_TEXTURE_IMMUTABLE_LEVELS.
Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Courtney Goeltzenleuchter [Mon, 4 Nov 2013 20:22:30 +0000 (13:22 -0700)]
mesa: update texture object for ARB_texture_view
Add state needed by glTextureView to the gl_texture_object.
Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Courtney Goeltzenleuchter [Mon, 4 Nov 2013 20:18:55 +0000 (13:18 -0700)]
mesa: Tracking for ARB_texture_view extension
Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Courtney Goeltzenleuchter [Mon, 4 Nov 2013 21:08:16 +0000 (14:08 -0700)]
mesa: Add API definitions for ARB_texture_view
Stub in glTextureView API call to go with the
glTextureView API xml definition.
Includes dispatch test for glTextureView
Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Anuj Phogat [Thu, 12 Dec 2013 22:34:27 +0000 (14:34 -0800)]
mesa: Fix error code generation in glBeginConditionalRender()
This patch changes the error condition to satisfy below statement
from OpenGL 4.3 core specification:
"An INVALID_OPERATION error is generated if id is the name of a query
object with a target other SAMPLES_PASSED, ANY_SAMPLES_PASSED, or
ANY_SAMPLES_PASSED_CONSERVATIVE, or if id is the name of a query
currently in progress."
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Carl Worth [Fri, 13 Dec 2013 05:33:02 +0000 (21:33 -0800)]
Makefile: Add bin/test-driver to EXTRA_FILES
I'm not sure why this change is necessary. When I've built previous tar files
(such as 9.2.4) with the "make tarballs" target, they include the
bin/test-driver file. But at my first attempt to build the tar files for the
10.0.1 release this file was not being included and the build failed.
(cherry picked from commit
d573899b932435b0b37a7a33ebcbdc3c8cedd3e1)
[The cherry pick is because I original applied this on the 10.0 branch while
working on the 10.0.1 release. But if we don't have this on master as well,
this issue will trip us up again the next time we make a new major-release
branch off of master.]
Kristian Høgsberg [Sun, 8 Dec 2013 06:02:11 +0000 (22:02 -0800)]
dri_util: Don't assume __DRIcontext->driverPrivate is a gl_context
The driverPrivate pointer is opaque to the driver and we can't assume
it's a struct gl_context in dri_util.c. Instead provide a helper function
to set the struct gl_context flags from the incoming DRI context flags.
v2 (idr): Modify the other classic drivers to also use
driContextSetFlags. I ran all the piglit GLX_ARB_create_context tests
with i965 and classic swrast without regressions.
Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> [v1 on Gallium nouveau]
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Carl Worth [Fri, 13 Dec 2013 07:07:26 +0000 (23:07 -0800)]
docs: Update note regarding nominating patches for the stable branch.
This brings the documentation up to date with the current practice of using
the CC syntax for patch nomination.
Carl Worth [Fri, 13 Dec 2013 07:02:54 +0000 (23:02 -0800)]
docs: Fix typo
Simply replacing Extentions with the correct Extensions.
Carl Worth [Fri, 13 Dec 2013 06:58:40 +0000 (22:58 -0800)]
docs: Import 9.2.5 release notes, add news item.
Carl Worth [Fri, 13 Dec 2013 06:21:08 +0000 (22:21 -0800)]
docs: Import 10.0.1 release notes, add news item.
Dave Airlie [Thu, 28 Nov 2013 01:08:11 +0000 (11:08 +1000)]
swrast* (gallium, classic): add MESA_copy_sub_buffer support (v3)
This patches add MESA_copy_sub_buffer support to the dri sw loader and
then to gallium state tracker, llvmpipe, softpipe and other bits.
It reuses the dri1 driver extension interface, and it updates the swrast
loader interface for a new putimage which can take a stride.
I've tested this with gnome-shell with a cogl hacked to reenable sub copies
for llvmpipe and the one piglit test.
I could probably split this patch up as well.
v2: pass a pipe_box, to reduce the entrypoints, as per Jose's review,
add to p_screen doc comments.
v3: finish off winsys interfaces, add swrast classic support as well.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
swrast: add support for copy_sub_buffer
Brian Paul [Thu, 12 Dec 2013 18:11:31 +0000 (11:11 -0700)]
util: fix compile breakage
D'oh!
Brian Paul [Thu, 12 Dec 2013 18:08:49 +0000 (11:08 -0700)]
util: move variable declaration out of for-loop
To fix MSVC build.
Marek Olšák [Wed, 4 Dec 2013 11:14:14 +0000 (12:14 +0100)]
gallium/util: implement new color clear API in u_blitter
Marek Olšák [Wed, 4 Dec 2013 00:24:37 +0000 (01:24 +0100)]
st/mesa: set correct PIPE_CLEAR_COLORn flags
This also fixes the clear_with_quad function for glClearBuffer.
Marek Olšák [Tue, 3 Dec 2013 23:56:24 +0000 (00:56 +0100)]
gallium: allow choosing which colorbuffers to clear
Required for glClearBuffer, which only clears one colorbuffer attachment.
Example:
If the first colorbuffer is float and the second one is int:
pipe->clear(pipe, PIPE_CLEAR_COLOR0, float_clear_color, ...);
pipe->clear(pipe, PIPE_CLEAR_COLOR1, int_clear_color, ...);
This doesn't need any driver changes yet, because all drivers just use:
if (flags & PIPE_CLEAR_COLOR) ..
The drivers which support GL 3.0 will have to implement it properly though.
Marek Olšák [Tue, 3 Dec 2013 23:39:52 +0000 (00:39 +0100)]
st/mesa: fix glClear with multiple colorbuffers and different formats
Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org>
Marek Olšák [Tue, 3 Dec 2013 23:27:20 +0000 (00:27 +0100)]
mesa: fix interpretation of glClearBuffer(drawbuffer)
This corresponding piglit tests supported this incorrect behavior instead of
pointing at it.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org>
Marek Olšák [Tue, 3 Dec 2013 23:25:55 +0000 (00:25 +0100)]
docs/GL3: better documentation of GL 3.0
Marek Olšák [Wed, 4 Dec 2013 20:48:26 +0000 (21:48 +0100)]
r600g,radeonsi: fix initialized buffer range tracking for DMA, add comments
The DMA functions modify dst_offset and size and util_range_add gets wrong
values.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Wed, 4 Dec 2013 12:54:50 +0000 (13:54 +0100)]
radeonsi: fix binding the dummy pixel shader
This fixes valgrind errors in glxinfo.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Wed, 4 Dec 2013 12:24:22 +0000 (13:24 +0100)]
radeonsi: fix FS_COLOR0_WRITES_ALL_CBUFS with mixed colorbuffer formats
The 16bpc packing must be done separately for each render target.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Wed, 4 Dec 2013 11:40:28 +0000 (12:40 +0100)]
radeonsi: use the colorbuffer count from the shader key
As a result, the initialization of write_all must be done before
the compilation.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Wed, 4 Dec 2013 11:28:29 +0000 (12:28 +0100)]
radeonsi: remove unused variable in si_pipe_shader_ps
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Andreas Hartmetz [Sat, 7 Dec 2013 03:42:24 +0000 (04:42 +0100)]
radeonsi: Write htile state to hardware.
Andreas Hartmetz [Sat, 7 Dec 2013 01:15:27 +0000 (02:15 +0100)]
radeon: Allocate htile buffer for SI in r600_texture.
Andreas Hartmetz [Sat, 7 Dec 2013 01:08:27 +0000 (02:08 +0100)]
radeon: rearrange r600_texture and related code a bit.
This should make the differences and similarities between color and
depth buffer handling more clear.
Marek Olšák [Fri, 29 Nov 2013 16:28:23 +0000 (17:28 +0100)]
r600g,radeonsi: consolidate buffer code, add handling of DISCARD_RANGE for SI
This adds 2 optimizations for radeonsi:
- handling of DISCARD_RANGE
- mapping an uninitialized buffer range is automatically UNSYNCHRONIZED
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Fri, 29 Nov 2013 15:26:36 +0000 (16:26 +0100)]
r600g,radeonsi: add common interface for buffer invalidation
This will be used by common code in the next commit.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Fri, 29 Nov 2013 15:05:45 +0000 (16:05 +0100)]
r600g,radeonsi: consolidate some debug flags
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Fri, 29 Nov 2013 15:02:12 +0000 (16:02 +0100)]
r600g: refactor out code for buffer invalidation
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Thu, 28 Nov 2013 14:09:35 +0000 (15:09 +0100)]
r600g,radeonsi: share flags has_cp_dma and has_streamout
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Wed, 27 Nov 2013 23:20:47 +0000 (00:20 +0100)]
radeonsi: handle PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE
which can come from glBufferData and glMapBufferRange.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Wed, 27 Nov 2013 12:27:54 +0000 (13:27 +0100)]
radeonsi: implement accelerated buffer copying
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Wed, 27 Nov 2013 11:43:40 +0000 (12:43 +0100)]
r600g: use common interfaces in buffer_transfer_unmap
i.e. dma_copy and resource_copy_region.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Tue, 26 Nov 2013 22:33:20 +0000 (23:33 +0100)]
radeon: move some functions to r600_buffer_common.c
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christoph Brill <egore911@gmail.com>
v2: Renamed r600_buffer.c to r600_buffer_common.c. The stupid build system
doesn't allow 2 files of the same name in different directories.
Marek Olšák [Tue, 26 Nov 2013 21:59:31 +0000 (22:59 +0100)]
winsys/radeon: set/get the scanout flag with the tiling ioctls
If we assume that all buffers allocated by the DDX are scanout, a new flag
that says "this is not scanout" has to be added to support the non-scanout
buffers and maintain backward compatibility.
This fixes bad rendering on Wayland.
The flag is defined as:
#define RADEON_TILING_R600_NO_SCANOUT RADEON_TILING_SWAP_16BIT
AFAIK, RADEON_TILING_SWAP_16BIT is not used on SI.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tapani Pälli [Thu, 12 Dec 2013 15:13:32 +0000 (17:13 +0200)]
glsl: modify ir_clone to use memcpy
Patch copies the whole data structure at once instead of
assigning individual variables.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tapani Pälli [Thu, 12 Dec 2013 13:08:59 +0000 (15:08 +0200)]
glsl: move variables in to ir_variable::data, part II
This patch moves following bitfields and variables to the data
structure:
explicit_location, explicit_index, explicit_binding, has_initializer,
is_unmatched_generic_inout, location_frac, from_named_ifc_block_nonarray,
from_named_ifc_block_array, depth_layout, location, index, binding,
max_array_access, atomic
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tapani Pälli [Thu, 12 Dec 2013 11:51:01 +0000 (13:51 +0200)]
glsl: move variables in to ir_variable::data, part I
This patch moves following bitfields in to the data structure:
used, assigned, how_declared, mode, interpolation,
origin_upper_left, pixel_center_integer
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tapani Pälli [Thu, 12 Dec 2013 10:57:57 +0000 (12:57 +0200)]
glsl: introduce data section to ir_variable
Data section helps serialization and cloning of a ir_variable. This
patch includes the helper bits used for read only ir_variables.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Tapani Pälli [Tue, 10 Dec 2013 12:28:40 +0000 (14:28 +0200)]
mesa: fix a typo in glDetachShader error message
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Brian Paul [Mon, 9 Dec 2013 18:46:56 +0000 (10:46 -0800)]
svga: expose HW smooth/stipple/wide lines
Newer virtual HW versions support smooth/stipple/wide lines.
Use that instead of 'draw' fallbacks when possible.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Juha-Pekka Heikkila [Wed, 11 Dec 2013 09:06:00 +0000 (02:06 -0700)]
glx: Add missing null check in DRI2WireToEvent
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Matthew McClure [Tue, 10 Dec 2013 21:10:03 +0000 (13:10 -0800)]
llvmpipe: add plumbing for ARB_depth_clamp
With this patch llvmpipe will adhere to the ARB_depth_clamp enabled state when
clamping the fragment's zw value. To support this, the variant key now includes
the depth_clamp state. key->depth_clamp is derived from pipe_rasterizer_state's
(depth_clip == 0), thus depth clamp is only enabled when depth clip is disabled.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Vadim Girlin [Wed, 11 Dec 2013 00:08:32 +0000 (04:08 +0400)]
r600g/sb: fix stack size computation on evergreen
On evergreen we have to reserve 1 stack element in some additional cases
besides the ones mentioned in the docs, but stack size computation was
recently reimplemented exactly as described in the docs by the patch that
added workarounds for stack issues on EG/CM, resulting in regressions
with some apps (Serious Sam 3).
This patch fixes it by restoring previous behavior.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=72369
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Andre Heider <a.heider@gmail.com>
Zack Rusin [Tue, 10 Dec 2013 05:10:28 +0000 (00:10 -0500)]
llvmpipe: add a very useful (disabled) debugging output
Disabled by default, but it's very useful when needed.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Zack Rusin [Tue, 10 Dec 2013 05:06:48 +0000 (00:06 -0500)]
draw: fix vbuf caching of vertices with inject front face
Caching in the vbuf module meant that once a vertex has been
emitted it was cached, but it's possible for a vertex at the
same location to be emitted again, but this time with a different
front-face semantic. Caching was causing the first version of the
vertex to be emitted, which resulted in the renderer getting
incorrect front-face attributes. By reseting the vertex_id (which
is used for caching) we make sure that once a front-face info
has been injected the vertex will endup getting emitted.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Zack Rusin [Fri, 6 Dec 2013 06:28:25 +0000 (01:28 -0500)]
llvmpipe: fix blending with half-float formats
The fact that we flush denorms to zero breaks our half-float
conversion and blending. This patches enables denorms for
blending. It's a little tricky due to the llvm bug that makes
it incorrectly reorder the mxcsr intrinsics:
http://llvm.org/bugs/show_bug.cgi?id=6393
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Zack Rusin <zackr@vmware.com>
Thomas Hellstrom [Thu, 21 Nov 2013 04:11:46 +0000 (15:11 +1100)]
svga/winsys: Implement surface sharing using prime fd handles
This needs a prime-aware vmwgfx kernel module to work properly.
(With additions by Christopher James Halse Rogers <raof@ubuntu.com>)
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Christopher James Halse Rogers [Mon, 25 Nov 2013 03:59:10 +0000 (14:59 +1100)]
gallium/radeon: Implement hooks for DRI Image 7 (v2)
v2: Fix transliteration of lseek arguments
Ignore busy return from RADEON_GEM_BUSY ioctl; we're only after the domain
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Christopher James Halse Rogers [Thu, 21 Nov 2013 04:11:44 +0000 (15:11 +1100)]
radeon: Rename bo_handles hashtable to match its actual contents.
It's a map of GEM name->bo, so identify it as such
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Christopher James Halse Rogers [Thu, 21 Nov 2013 04:11:43 +0000 (15:11 +1100)]
ilo: Support DRI Image 7
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Maarten Lankhorst [Thu, 21 Nov 2013 04:11:42 +0000 (15:11 +1100)]
nouveau: Support DRI Image 7 extension
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Christopher James Halse Rogers [Mon, 25 Nov 2013 03:57:37 +0000 (14:57 +1100)]
gallium/dri: Support DRI Image extension version 7
v2: Fix up queryImage return for ATTRIB_FD
Use driver_descriptor.configuration to determine whether the driver
supports DMA-BUF import/export.
v3: Really, truly, fix up queryImage return for ATTRIB_FD
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Christopher James Halse Rogers [Thu, 21 Nov 2013 04:11:40 +0000 (15:11 +1100)]
gallium/dri2: Set winsys_handle type to KMS for stride query.
Otherwise the default is TYPE_SHARED, which will flink the bo. This seems
rather unnecessary for a simple stride query.
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Christopher James Halse Rogers [Thu, 21 Nov 2013 04:11:39 +0000 (15:11 +1100)]
gallium/winsys/drm: Prepare for passing prime fds in winsys_handle
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Christopher James Halse Rogers [Tue, 26 Nov 2013 07:44:05 +0000 (18:44 +1100)]
gallium/dri: Support DRI Image extension version 6
v2: Pick out the correct gl_context pointer
v3: Don't leak pipe_resources on error path
Set img->dri_format correctly
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
Reviewed-by: Thomas Hellstrom <thellstrom@vmware.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Ilia Mirkin [Sun, 1 Dec 2013 08:44:42 +0000 (03:44 -0500)]
nv50: report 15 max inputs for fragment programs
First off, nv50_program only has 16 in/out varyings. However reporting
16 makes 'm' become 68 in nv50_fp_linkage_validate with the
varying-packing-simple piglit test. (Subverting the assert makes it
compile but fail.) With this patch, varying-packing-simple passes.
See: https://bugs.freedesktop.org/show_bug.cgi?id=69155
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "9.2 10.0" <mesa-stable@lists.freedesktop.org>
Maarten Lankhorst [Tue, 10 Dec 2013 07:43:03 +0000 (08:43 +0100)]
nouveau: Fix compiler warning regression
cfg is now unused, remove it.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Dave Airlie [Thu, 5 Dec 2013 03:30:17 +0000 (13:30 +1000)]
swrast: fix readback regression since inversion fix
This readback from the frontbuffer with swrast was broken, that bug
just made it more obviously broken, this fixes it by inverting the
sub image gets. Also fixes a few other piglits.
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=72327
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=72325
(for 9.2 the patches this depends on were asked to be backported separately
in an email).
Cc: "9.2" "10.0" mesa-stable@lists.fedoraproject.org
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Jordan Justen [Fri, 6 Dec 2013 10:21:17 +0000 (02:21 -0800)]
dri megadriver_stub: add compatibility for older DRI loaders
To help the transition period when DRI loaders are being updated
to support the newer __driDriverExtensions_foo mechanism,
we populate __driDriverExtensions with the extensions returned
by __driDriverExtensions_foo during a library contructor
function.
We find the driver foo's name by using the dladdr function
which gives the path of the dynamic library's name that
was being loaded.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Keith Packard <keithp@keithp.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Kristian Høgsberg [Tue, 10 Dec 2013 00:13:35 +0000 (16:13 -0800)]
egl/wayland: Return -1 from get_back_bo to indicate error
A return value of -1 indicate failure to allocate the back buffer and
means we don't segfault on the way out.
Neil Roberts [Wed, 11 Sep 2013 18:28:32 +0000 (19:28 +0100)]
egl_dri2: Remove the unused swap_interval member of dri2_egl_surface
The _EGLSurface struct which is embedded into dri2_egl_surface also contains a
swap interval member so the other member is redundant. Nothing was using it as
far as I can tell.
Kenneth Graunke [Mon, 25 Nov 2013 21:53:33 +0000 (13:53 -0800)]
i965: Replace OUT_RELOC_FENCED with OUT_RELOC.
On Gen4+, OUT_RELOC_FENCED is equivalent to OUT_RELOC; libdrm silently
ignores the fenced flag:
/* We never use HW fences for rendering on 965+ */
if (bufmgr_gem->gen >= 4)
need_fence = false;
Thanks to Eric for noticing this.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Paul Berry [Fri, 29 Nov 2013 08:52:11 +0000 (00:52 -0800)]
glsl/loops: Get rid of lower_bounded_loops and ir_loop::normative_bound.
Now that loop_controls no longer creates normatively bound loops,
there is no need for ir_loop::normative_bound or the
lower_bounded_loops pass.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Fri, 29 Nov 2013 08:16:43 +0000 (00:16 -0800)]
glsl/loops: Stop creating normatively bound loops in loop_controls.
Previously, when loop_controls analyzed a loop and found that it had a
fixed bound (known at compile time), it would remove all of the loop
terminators and instead set the loop's normative_bound field to force
the loop to execute the correct number of times.
This made loop unrolling easy, but it had a serious disadvantage.
Since most GPU's don't have a native mechanism for executing a loop a
fixed number of times, in order to implement the normative bound, the
back-ends would have to synthesize a new loop induction variable. As
a result, many loops wound up having two induction variables instead
of one. This caused extra register pressure and unnecessary
instructions.
This patch modifies loop_controls so that it doesn't set the loop's
normative_bound anymore. Instead it leaves one of the terminators in
the loop (the limiting terminator), so the back-end doesn't have to go
to any extra work to ensure the loop terminates at the right time.
This complicates loop unrolling slightly: when deciding whether a loop
can be unrolled, we have to account for the presence of the limiting
terminator. And when we do unroll the loop, we have to remove the
limiting terminator first.
For an example of how this results in more efficient back end code,
consider the loop:
for (int i = 0; i < 100; i++) {
total += i;
}
Previous to this patch, on i965, this loop would compile down to this
(vec4) native code:
mov(8) g4<1>.xD 0D
mov(8) g8<1>.xD 0D
loop:
cmp.ge.f0(8) null g8<4;4,1>.xD 100D
(+f0) if(8)
break(8)
endif(8)
add(8) g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD
add(8) g8<1>.xD g8<4;4,1>.xD 1D
add(8) g4<1>.xD g4<4;4,1>.xD 1D
while(8) loop
(notice that both g8 and g4 are loop induction variables; one is used
to terminate the loop, and the other is used to accumulate the total).
After this patch, the same loop compiles to:
mov(8) g4<1>.xD 0D
loop:
cmp.ge.f0(8) null g4<4;4,1>.xD 100D
(+f0) if(8)
break(8)
endif(8)
add(8) g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD
add(8) g4<1>.xD g4<4;4,1>.xD 1D
while(8) loop
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Fri, 29 Nov 2013 08:11:12 +0000 (00:11 -0800)]
glsl/loops: Get rid of loop_variable_state::max_iterations.
This value is now redundant with
loop_variable_state::limiting_terminator->iterations and
ir_loop::normative_bound.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Fri, 29 Nov 2013 06:12:08 +0000 (22:12 -0800)]
glsl/loops: Simplify loop unrolling logic by breaking into functions.
The old logic of loop_unroll_visitor::visit_leave(ir_loop *) was:
heuristics to skip unrolling in various circumstances;
if (loop contains more than one jump)
return;
else if (loop contains one jump) {
if (the jump is an unconditional "break" at the end of the loop) {
remove the break and set iteration count to 1;
fall through to simple loop unrolling code;
} else {
for (each "if" statement in the loop body)
see if the jump is a "break" at the end of one of its forks;
if (the "break" wasn't found)
return;
splice the remainder of the loop into the other fork of the "if";
remove the "break";
complex loop unrolling code;
return;
}
}
simple loop unrolling code;
return;
These tasks have been moved to their own functions:
- splice the remainder of the loop into the other fork of the "if"
- simple loop unrolling code
- complex loop unrolling code
And the logic has been flattened to:
heuristics to skip unrolling in various circumstances;
if (loop contains more than one jump)
return;
if (loop contains no jumps) {
simple loop unroll;
return;
}
if (the jump is an unconditional "break" at the end of the loop) {
remove the break;
simple loop unroll with iteration count of 1;
return;
}
for (each "if" statement in the loop body) {
if (the jump is a "break" at the end of one of its forks) {
splice the remainder of the loop into the other fork of the "if";
remove the "break";
complex loop unroll;
return;
}
}
This will make it easier to modify the loop unrolling algorithm in a
future patch.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Thu, 28 Nov 2013 22:40:19 +0000 (14:40 -0800)]
glsl/loops: Move some analysis from loop_controls to loop_analysis.
Previously, the sole responsibility of loop_analysis was to find all
the variables referenced in the loop that are either loop constant or
induction variables, and find all of the simple if statements that
might terminate the loop. The remainder of the analysis necessary to
determine how many times a loop executed was performed by
loop_controls.
This patch makes loop_analysis also responsible for determining the
number of iterations after which each loop terminator will terminate
the loop, and for figuring out which terminator will terminate the
loop first (I'm calling this the "limiting terminator").
This will allow loop unrolling to make use of information that was
previously only visible from loop_controls, namely the identity of the
limiting terminator.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Thu, 28 Nov 2013 22:46:38 +0000 (14:46 -0800)]
glsl/loops: Allocate loop_terminator using new(mem_ctx) syntax.
Patches to follow will introduce code into the loop_terminator
constructor. Allocating loop_terminator using new(mem_ctx) syntax
will ensure that the constructor runs.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Thu, 28 Nov 2013 20:44:53 +0000 (12:44 -0800)]
glsl/loops: Remove unnecessary list walk from loop_control_visitor.
When loop_control_visitor::visit_leave(ir_loop *) is analyzing a loop
terminator that acts on a certain ir_variable, it doesn't need to walk
the list of induction variables to find the loop_variable entry
corresponding to the variable. It can just look it up in the
loop_variable_state hashtable and verify that the loop_variable entry
represents an induction variable.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Thu, 28 Nov 2013 20:17:54 +0000 (12:17 -0800)]
glsl/loops: Remove unused fields iv_scale and biv from loop_variable class.
These fields were part of some planned optimizations that never
materialized. Remove them for now to simplify things; if we ever get
round to adding the optimizations that would require them, we can
always re-introduce them.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Thu, 28 Nov 2013 16:13:41 +0000 (08:13 -0800)]
glsl/loops: replace loop controls with a normative bound.
This patch replaces the ir_loop fields "from", "to", "increment",
"counter", and "cmp" with a single integer ("normative_bound") that
serves the same purpose.
I've used the name "normative_bound" to emphasize the fact that the
back-end is required to emit code to prevent the loop from running
more than normative_bound times. (By contrast, an "informative" bound
would be a bound that is informational only).
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Thu, 28 Nov 2013 01:57:19 +0000 (17:57 -0800)]
glsl/loops: consolidate bounded loop handling into a lowering pass.
Previously, all of the back-ends (ir_to_mesa, st_glsl_to_tgsi, and the
i965 fs and vec4 visitors) had nearly identical logic for handling
bounded loops. This replaces the duplicate logic with an equivalent
lowering pass that is used by all the back-ends.
Note: on i965, there is a slight increase in instruction count. For
example, a loop like this:
for (int i = 0; i < 100; i++) {
total += i;
}
would previously compile down to this (vec4) native code:
mov(8) g4<1>.xD 0D
mov(8) g8<1>.xD 0D
loop:
cmp.ge.f0(8) null g8<4;4,1>.xD 100D
(+f0) break(8)
add(8) g5<1>.xD g5<4;4,1>.xD g4<4;4,1>.xD
add(8) g8<1>.xD g8<4;4,1>.xD 1D
add(8) g4<1>.xD g4<4;4,1>.xD 1D
while(8) loop
After this patch, the "(+f0) break(8)" turns into:
(+f0) if(8)
break(8)
endif(8)
because the back-end isn't smart enough to recognize that "if
(condition) break;" can be done using a conditional break instruction.
However, it should be relatively easy for a future peephole
optimization to properly optimize this.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Thu, 28 Nov 2013 19:11:17 +0000 (11:11 -0800)]
glsl: In loop analysis, handle unconditional second assignment.
Previously, loop analysis would set
this->conditional_or_nested_assignment based on the most recently
visited assignment to the variable. As a result, if a vaiable was
assigned to more than once in a loop, the flag might be set
incorrectly. For example, in a loop like this:
int x;
for (int i = 0; i < 3; i++) {
if (i == 0)
x = 10;
...
x = 20;
...
}
loop analysis would have incorrectly concluded that all assignments to
x were unconditional.
In practice this was a benign bug, because
conditional_or_nested_assignment is only used to disqualify variables
from being considered as loop induction variables or loop constant
variables, and having multiple assignments also disqualifies a
variable from being considered as either of those things.
Still, we should get the analysis correct to avoid future confusion.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Thu, 28 Nov 2013 19:06:43 +0000 (11:06 -0800)]
glsl: Fix handling of function calls inside nested loops.
Previously, when visiting an ir_call, loop analysis would only mark
the innermost enclosing loop as containing a call. As a result, when
encountering a loop like this:
for (i = 0; i < 3; i++) {
for (int j = 0; j < 3; j++) {
foo();
}
}
it would incorrectly conclude that the outer loop ran three times.
(This is not certain; if foo() modifies i, then the outer loop might
run more or fewer times).
Fixes piglit test "vs-call-in-nested-loop.shader_test".
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Thu, 28 Nov 2013 18:48:37 +0000 (10:48 -0800)]
glsl: Fix loop analysis of nested loops.
Previously, when visiting a variable dereference, loop analysis would
only consider its effect on the innermost enclosing loop. As a
result, when encountering a loop like this:
for (int i = 0; i < 3; i++) {
for (int j = 0; j < 3; j++) {
...
i = 2;
}
}
it would incorrectly conclude that the outer loop ran three times.
Fixes piglit test "vs-inner-loop-modifies-outer-loop-var.shader_test".
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Thu, 28 Nov 2013 18:42:01 +0000 (10:42 -0800)]
glsl: Extract functions from loop_analysis::visit(ir_dereference_variable *).
This function is about to get more complex.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Tue, 3 Dec 2013 20:00:49 +0000 (12:00 -0800)]
i965/gen7+: Implement fast color clears for MSAA buffers.
Fast color clears of MSAA buffers work just like fast color clears
with non-MSAA buffers, except that the alignment and scaledown
requirements are different.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Paul Berry [Tue, 3 Dec 2013 19:40:23 +0000 (11:40 -0800)]
i965/blorp: Refactor code for computing fast clear align/scaledown factors.
This will make it easier to add fast color clear support to MSAA
buffers, since they have different alignment and scaling requirements.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Tue, 3 Dec 2013 16:48:41 +0000 (08:48 -0800)]
i965/blorp: allow multisample blorp clears
Previously, we didn't do multisample blorp clears because we couldn't
figure out how to get them to work. The reason for this was because
we weren't setting the brw_blorp_params num_samples field consistently
with dst.num_samples. Now that those two fields have been collapsed
down into one, we can do multisample blorp clears.
However, we need to do a few other pieces of bookkeeping to make them
work correctly in all circumstances:
- Since blorp clears may now operate on multisampled window system
framebuffers, they need to call
intel_renderbuffer_set_needs_downsample() to ensure that a
downsample happens before buffer swap (or glReadPixels()).
- When clearing a layered multisample buffer attachment using UMS or
CMS layout, we need to advance layer by multiples of num_samples
(since each logical layer is associated with num_samples physical
layers).
Note: we still don't do multisample fast color clears; more work needs
to be done to enable those.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Tue, 3 Dec 2013 17:44:46 +0000 (09:44 -0800)]
i965/blorp: Get rid of redundant num_samples blorp param.
Previously, brw_blorp_params contained two fields for determining
sample count: num_samples (which determined the multisample
configuration of the rendering pipeline) and dst.num_samples (which
determined the multisample configuration of the render target
surface). This was redundant, since both fields had to be set to the
same value to avoid rendering errors.
This patch eliminates num_samples to avoid future confusion.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Tue, 3 Dec 2013 16:33:41 +0000 (08:33 -0800)]
i965/gen7+: Disentangle MSAA layout from fast clear state.
This patch renames the enum that's used to keep track of fast clear
state from "mcs_state" to "fast_clear_state", and it removes the enum
value INTEL_MCS_STATE_MSAA (which previously meant, "this is an MSAA
buffer, so we're not keeping track of fast clear state"). The only
real purpose that enum value was serving was to prevent us from trying
to do fast clear resolves on MSAA buffers, and it's just as easy to
prevent that by checking the buffer's msaa_layout.
This paves the way for implementing fast clears of MSAA buffers.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Tue, 3 Dec 2013 23:41:14 +0000 (15:41 -0800)]
i965: Don't try to use HW blitter for glCopyPixels() when multisampled.
The hardware blitter doesn't understand multisampled layouts, so
there's no way this could possibly succeed.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Wed, 4 Dec 2013 05:15:47 +0000 (21:15 -0800)]
i965: Document conventions for counting layers in 2D multisample buffers.
The "layer" parameters used in blorp, and the
intel_renderbuffer::mt_layer field, represent a physical layer rather
than a logical layer. This is important for 2D multisample arrays on
Gen7+ because the UMS and CMS multisample layouts use N physical
layers to represent each logical layer, where N is the number of
samples.
Also add an assertion to blorp to help catch bugs if we fail to follow
these conventions.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Thu, 5 Dec 2013 12:34:42 +0000 (04:34 -0800)]
i965/blorp: Improve fast color clear comment.
Clarify the fact that we only optimize full buffer clears using fast
color clear, and why.
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tom Stellard [Tue, 3 Dec 2013 02:04:58 +0000 (21:04 -0500)]
r300/compiler/tests: Fix line length check in test parser
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
CC: "9.2" "10.0" <mesa-stable@lists.freedesktop.org>
Tom Stellard [Tue, 3 Dec 2013 02:04:42 +0000 (21:04 -0500)]
r300/compiler/tests: Fix segfault
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
CC: "9.2" "10.0" <mesa-stable@lists.freedesktop.org>
Ilia Mirkin [Sat, 7 Dec 2013 17:59:25 +0000 (12:59 -0500)]
nouveau/video: update a few more h264 picparm field names
Based on comments by Benjamin Morris <bmorris@nvidia.com> in
http://lists.freedesktop.org/archives/nouveau/2013-December/015328.html
This adds setting of is_long_term, and updates a few field names we were
unclear about.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Ilia Mirkin [Sat, 7 Dec 2013 17:00:47 +0000 (12:00 -0500)]
nouveau/video: update h264 picparm field names based on usage
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Ilia Mirkin [Sat, 7 Dec 2013 04:30:04 +0000 (23:30 -0500)]
nv50: enable h264 and mpeg4 for nv98+ (vp3, vp4.0)
Create the ref_bo without any storage type flags set for now. The issue
probably arises from our use of the additional buffer space at the end
of the ref_bo. It should probably be split up in the future.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Tested-by: Martin Peres <martin.peres@labri.fr>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Ilia Mirkin [Fri, 6 Dec 2013 02:43:07 +0000 (21:43 -0500)]
nvc0: make sure nvd7 gets NVC8_3D_CLASS as well
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Fri, 29 Nov 2013 06:18:57 +0000 (01:18 -0500)]
nv50: TXF already has integer arguments, don't try to convert from f32
Fixes the texelFetch piglit tests
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>