José Fonseca [Tue, 7 Jan 2014 13:42:23 +0000 (13:42 +0000)]
cso_context: Fix cso_context::sample_mask initial value.
The initial value of cso_context::sample_mask_saved is irrelevant as it
will be overwritten with cso_context::sample_mask in
cso_save_sample_mask. Therefore it is cso_context::sample_mask that
needs to be properly initialized.
This fixes regressions in blits and mipmap generation after adding
support for sample_mask to llvmpipe.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
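To illustrate the cso_context fix above, here is a minimal sketch in C of why the live value, not the saved copy, has to be initialized. The struct and the toy_* functions are hypothetical stand-ins for the real cso_context code, and the all-bits-set default is an assumption about what "properly initialized" means here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the real cso_context. */
struct toy_cso_context {
   unsigned sample_mask;        /* live value handed to the driver */
   unsigned sample_mask_saved;  /* scratch copy used by save/restore */
};

static struct toy_cso_context *toy_cso_create(void)
{
   struct toy_cso_context *ctx = calloc(1, sizeof(*ctx));
   if (!ctx)
      return NULL;
   /* Assumed default: all samples enabled. Initializing only
    * sample_mask_saved would be pointless, because the save step
    * below overwrites it unconditionally. */
   ctx->sample_mask = ~0u;
   return ctx;
}

static void toy_cso_save_sample_mask(struct toy_cso_context *ctx)
{
   ctx->sample_mask_saved = ctx->sample_mask;
}

static void toy_cso_restore_sample_mask(struct toy_cso_context *ctx)
{
   ctx->sample_mask = ctx->sample_mask_saved;
}
```

With a zero-initialized sample_mask, a save/restore pair around a blit would restore a mask of 0, which would plausibly explain the blit and mipmap regressions the commit mentions.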
Si Chen [Wed, 18 Dec 2013 10:17:55 +0000 (02:17 -0800)]
llvmpipe: Implement alpha_to_coverage for non-MSAA framebuffers.
Implement Alpha to Coverage by discarding a fragment when its alpha
component is less than 0.5. This is joint work by Jose and Si.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
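The rule described in the alpha_to_coverage commit above can be sketched in C as follows. The function name is hypothetical, and treating an alpha of exactly 0.5 as covered is an assumption drawn from the "less than 0.5" wording; the real llvmpipe code generates this test through LLVM IR rather than plain C.

```c
#include <assert.h>

/* Hypothetical helper: coverage for a single-sample (non-MSAA)
 * framebuffer. With only one sample, alpha-to-coverage degenerates
 * into a keep/discard decision. */
static unsigned toy_alpha_to_coverage_mask(float alpha)
{
   /* Discard (mask 0) when alpha is below 0.5, keep (mask 1) otherwise. */
   return alpha < 0.5f ? 0u : 1u;
}
```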
Andreas Fänger [Tue, 7 Jan 2014 10:10:00 +0000 (03:10 -0700)]
swrast: fix delayed texel buffer allocation regression for OpenMP
Commit 9119269ca14ed42b51c7d8e2e662500311b29fa3 moved the texel
buffer allocation to _swrast_texture_span(), however, when compiled
with OpenMP support this code already runs multi-threaded so a
critical section is required to prevent multiple allocations and
rendering errors.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
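A minimal sketch in C of the kind of guard the swrast commit above describes: a lazily allocated buffer protected by an OpenMP critical section. The struct, field, and function names are hypothetical, not the actual swrast ones. When compiled without OpenMP support the pragma is simply ignored.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-context span state. */
struct toy_span_ctx {
   float *texel_buffer;
   int alloc_count;   /* illustration only: counts real allocations */
};

static float *toy_get_texel_buffer(struct toy_span_ctx *ctx, size_t n)
{
   float *buf;

   /* Only one thread at a time may test and allocate; otherwise two
    * threads could both see NULL and both allocate. */
#pragma omp critical (toy_texel_buffer)
   {
      if (!ctx->texel_buffer) {
         ctx->texel_buffer = malloc(n * sizeof(float));
         if (ctx->texel_buffer)
            ctx->alloc_count++;
      }
      buf = ctx->texel_buffer;
   }
   return buf;
}
```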
Dave Airlie [Mon, 6 Jan 2014 22:51:10 +0000 (08:51 +1000)]
gallium/draw: remove double semicolon
code cleanup.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Brian Paul [Mon, 6 Jan 2014 23:11:21 +0000 (16:11 -0700)]
glsl: rename min(), max() functions to fix MSVC build
Evidently, there's some other definition of "min" and "max" that
causes MSVC to choke on these function names. Renaming to min2()
and max2() fixes things.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
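The commit above does not say where the conflicting definitions come from. One plausible cause, shown here as a sketch in C, is a platform header that defines function-like macros named min and max. The macros below are stand-ins written for illustration; they are an assumption, not a quote from any particular header.

```c
#include <assert.h>

/* Stand-ins for macros a platform header might define (assumption). */
#define min(a, b) (((a) < (b)) ? (a) : (b))
#define max(a, b) (((a) > (b)) ? (a) : (b))

/* A declaration such as "static int min(int a, int b)" would be
 * macro-expanded at this point and fail to compile. Distinct names
 * avoid the collision entirely. */
static int min2(int a, int b)
{
   return a < b ? a : b;
}

static int max2(int a, int b)
{
   return a > b ? a : b;
}
```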
Kenneth Graunke [Thu, 12 Dec 2013 05:53:27 +0000 (21:53 -0800)]
i965: Remove unused PIPE_CONTROL defines.
Both brw_defines.h and intel_reg.h defined PIPE_CONTROL fields, which
had similar names, but couldn't be used in the same way. (One had
built-in shifts, and the other didn't...)
Delete the unused set to preserve sanity.
(Eric wrote an almost identical patch back in August, so I believe he
approves.)
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Vinson Lee [Mon, 6 Jan 2014 20:09:29 +0000 (12:09 -0800)]
mesa: Remove GLXContextID typedef from glxext.h.
This patch fixes this build error with gcc <= 4.5 and clang <= 3.1.
CC clientattrib.lo
In file included from ../../include/GL/glx.h:333:0,
from glxclient.h:45,
from clientattrib.c:32:
../../include/GL/glxext.h:275:13: error: redefinition of typedef 'GLXContextID'
../../include/GL/glx.h:171:13: note: previous declaration of 'GLXContextID' was here
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70591
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Maxence Le Doré [Thu, 2 Jan 2014 23:09:55 +0000 (00:09 +0100)]
docs/relnotes/10.1.html: report AMD_shader_trinary_minmax support
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Maxence Le Doré [Thu, 2 Jan 2014 23:09:54 +0000 (00:09 +0100)]
mesa: enable AMD_shader_trinary_minmax
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Maxence Le Doré [Thu, 2 Jan 2014 23:09:53 +0000 (00:09 +0100)]
glsl: implement mid3 built-in function
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Maxence Le Doré [Thu, 2 Jan 2014 23:09:52 +0000 (00:09 +0100)]
glsl: implement max3 built-in function
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Maxence Le Doré [Thu, 2 Jan 2014 23:09:51 +0000 (00:09 +0100)]
glsl: Implement min3 built-in function
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Maxence Le Doré [Thu, 2 Jan 2014 23:09:50 +0000 (00:09 +0100)]
glsl: add min() and max() functions to builder.cpp
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Maxence Le Doré [Thu, 2 Jan 2014 23:09:49 +0000 (00:09 +0100)]
glsl: add a shader_trinary_minmax predicate
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Maxence Le Doré [Thu, 2 Jan 2014 23:09:48 +0000 (00:09 +0100)]
glsl: Add extension tracking for AMD_shader_trinary_minmax
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Alexander von Gluck IV [Tue, 31 Dec 2013 21:39:49 +0000 (15:39 -0600)]
haiku libGL: Move from gallium target to src/hgl
* The Haiku renderers need to link to libGL to function properly
in all usage contexts. As mesa drivers build before gallium
targets, we couldn't properly link the mesa swrast driver to
the gallium libGL target for Haiku.
* This is likely better as it mimics how glx is laid out ensuring
the Haiku libGL is better understood.
* All renderers properly link in libGL now.
Acked-by: Brian Paul <brianp@vmware.com>
Alexander von Gluck IV [Tue, 31 Dec 2013 05:49:06 +0000 (23:49 -0600)]
haiku: Fix missing HaikuGL header paths
Acked-by: Brian Paul <brianp@vmware.com>
Brian Paul [Mon, 6 Jan 2014 19:50:43 +0000 (12:50 -0700)]
mesa: implement missing glGet(GL_RGBA_SIGNED_COMPONENTS_EXT) query
This is part of the GL_EXT_packed_float extension.
Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=73096
Cc: 10.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Eric Anholt [Tue, 24 Dec 2013 00:08:53 +0000 (16:08 -0800)]
i965: Warning fix
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Mon, 6 Jan 2014 18:51:20 +0000 (10:51 -0800)]
i965: Delete unused INTEL_WRITE_{PART,FULL} and INTEL_READ #defines.
These are just software flag values (not hardware specific values), and
aren't used anywhere. Delete them to avoid confusion.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Marek Olšák [Sun, 5 Jan 2014 11:48:30 +0000 (12:48 +0100)]
radeonsi: calculate NUM_BANKS for DB correctly on CIK
NUM_BANKS is not constant on CIK.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Fri, 27 Dec 2013 18:17:47 +0000 (19:17 +0100)]
radeonsi: set correct pipe config for Hawaii in DB
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Fri, 27 Dec 2013 18:14:55 +0000 (19:14 +0100)]
radeonsi: disable HTILE for 1D-tiled depth-stencil buffers
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Juha-Pekka Heikkila [Fri, 3 Jan 2014 12:57:00 +0000 (05:57 -0700)]
glx: check memory allocations in __glXInitVertexArrayState()
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Juha-Pekka Heikkila [Fri, 3 Jan 2014 12:57:00 +0000 (05:57 -0700)]
glx: Add missing null check in __glXNewIndirectAPI()
Add extra null check in auto generated indirect_init.c via
src/mapi/glapi/gen/glX_proto_send.py
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Nathan Kidd [Fri, 3 Jan 2014 23:44:00 +0000 (16:44 -0700)]
docs: fix misspellings
Fixed what I noticed; no warranty for exhaustiveness.
Signed-off-by: Nathan Kidd <nkidd@opentext.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Chris Forbes [Sat, 4 Jan 2014 02:27:54 +0000 (15:27 +1300)]
i965: set size of txf_mcs payload vgrf properly
Previously we left the size of this vgrf as 1, which caused register
allocation to be subtly broken. If we were lucky we would explode in
the post-alloc instruction scheduler; if we were unlucky we'd just stomp
on someone else and get broken rendering.
Fixes crash when running `tesseract` with the following settings:
msaa 4
glineardepth 0
Also fixes the piglit test:
arb_sample_shading-builtin-gl-sample-id
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: Anuj Phogat <anuj.phogat@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72859
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Erik Faye-Lund [Tue, 17 Dec 2013 15:37:33 +0000 (16:37 +0100)]
glcpp: error on multiple #else/#elif directives
The preprocessor currently accepts multiple else/elif-groups
per if-section. The GLSL-preprocessor is defined by the C++
specification, which defines the following parse-rule:
if-section:
if-group elif-groups(opt) else-group(opt) endif-line
This clearly only allows a single else-group, that has to come
after any elif-groups.
So let's modify the code to follow the specification. Add test
to prevent regressions.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Carl Worth <cworth@cworth.org>
Cc: 10.0 <mesa-stable@lists.freedesktop.org>
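The grammar rule quoted in the glcpp commit above can be enforced with one flag per open if-section. The following C sketch shows that idea on a pre-tokenized directive stream; the enum, the depth limit, and the function name are hypothetical, and the real glcpp parser is a bison grammar rather than a hand-written loop.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_directive { TOY_IF, TOY_ELIF, TOY_ELSE, TOY_ENDIF };

#define TOY_MAX_DEPTH 32

/* Returns true when every if-section has at most one else-group and
 * no elif-group follows that else-group. */
static bool toy_check_directives(const enum toy_directive *d, int n)
{
   bool seen_else[TOY_MAX_DEPTH];
   int depth = 0;

   for (int i = 0; i < n; i++) {
      switch (d[i]) {
      case TOY_IF:
         if (depth == TOY_MAX_DEPTH)
            return false;
         seen_else[depth++] = false;
         break;
      case TOY_ELIF:
         /* #elif after #else is an error. */
         if (depth == 0 || seen_else[depth - 1])
            return false;
         break;
      case TOY_ELSE:
         /* A second #else in the same if-section is an error. */
         if (depth == 0 || seen_else[depth - 1])
            return false;
         seen_else[depth - 1] = true;
         break;
      case TOY_ENDIF:
         if (depth == 0)
            return false;
         depth--;
         break;
      }
   }
   return depth == 0;
}
```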
Carl Worth [Fri, 20 Dec 2013 00:06:31 +0000 (16:06 -0800)]
glcpp: Replace multi-line comment with a space (even as part of macro definition)
The preprocessor has always replaced multi-line comments with a single space
character, (as required by the specification), but as of commit
bd55ba568b301d0f764cd1ca015e84e1ae932c8b the lexer also emitted a NEWLINE
token for each newline within the comment, (in order to preserve line
numbers).
The emitting of NEWLINE tokens within the comment broke the rule of "replace a
multi-line comment with a single space" as could be exposed by code like the
following:
#define FOO a/*
*/b
FOO
Prior to commit bd55ba568b301d0f764cd1ca015e84e1ae932c8b, this code defined
the macro FOO as "a b" as desired. Since that commit, this code instead
defines FOO as "a" and leaves a stray "b" in the output.
In this commit, we fix this by not emitting the NEWLINE tokens while lexing
the comment, but instead merely counting them in the commented_newlines
variable. Then, when the lexer next encounters a non-commented newline it
switches to a NEWLINE_CATCHUP state to emit as many NEWLINE tokens as
necessary (so that subsequent parsing stages still generate correct line
numbers).
Of course, it would have been more clear if we could have written a loop to
emit all the newlines, but flex conventions prevent that, (we must use
"return" for each token we emit).
It similarly would have been clear to have a new rule restricted to the
<NEWLINE_CATCHUP> state with an action much like the body of this if
condition. The problem with that is that this rule must not consume any
characters. It might be possible to write a rule that matches a single
lookahead of any character, but then we would also need an additional rule to
handle the <EOF> case, where there are no additional characters available
for the lookahead to match.
Given those considerations, and given that the SKIP-state manipulation already
involves a code block at the top of the lexer function, before any rules, it
seems best to me to go with the implementation here which adds a similar
pre-rule code block for the NEWLINE_CATCHUP.
Finally, this commit also changes the expected output of a few, existing glcpp
tests. The change here is that the space character resulting from the
multi-line comment is now emitted before the newlines corresponding to that
comment. (Previously, the newlines were emitted first, and the space character
afterward.)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72686
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
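The deferred-newline idea from the multi-line comment commit above can be sketched in plain C, outside of flex. The function below is hypothetical: it replaces each comment with one space, counts the newlines inside it, and emits them only after the next real newline, which is roughly the role the commit assigns to commented_newlines and NEWLINE_CATCHUP. It works on a string rather than a token stream, so it is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper; writes at most out_size - 1 characters plus a
 * terminator and returns the number of characters written. */
static size_t toy_strip_comments(const char *in, char *out, size_t out_size)
{
   size_t o = 0;
   int commented_newlines = 0;

   if (out_size == 0)
      return 0;

   for (size_t i = 0; in[i] != '\0'; ) {
      if (in[i] == '/' && in[i + 1] == '*') {
         /* Skip the comment body, counting its newlines. */
         i += 2;
         while (in[i] != '\0' && !(in[i] == '*' && in[i + 1] == '/')) {
            if (in[i] == '\n')
               commented_newlines++;
            i++;
         }
         if (in[i] != '\0')
            i += 2;
         /* The whole comment becomes a single space. */
         if (o + 1 < out_size)
            out[o++] = ' ';
      } else if (in[i] == '\n') {
         if (o + 1 < out_size)
            out[o++] = '\n';
         /* Catch up on newlines swallowed by earlier comments so that
          * later line numbers stay correct. */
         while (commented_newlines > 0) {
            if (o + 1 < out_size)
               out[o++] = '\n';
            commented_newlines--;
         }
         i++;
      } else {
         if (o + 1 < out_size)
            out[o++] = in[i];
         i++;
      }
   }
   out[o] = '\0';
   return o;
}
```

Fed the FOO example from the commit message, this keeps "a b" on the #define line and still produces the same total number of newlines as the input.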
Carl Worth [Thu, 19 Dec 2013 23:14:19 +0000 (15:14 -0800)]
glcpp: Add a more descriptive comment for the SKIP state manipulation
Two things make this code confusing:
1. The uncharacteristic manipulation of lexer start state outside of
flex rules.
2. The confusing semantics of the skip_stack (including the
"lexing_if" override and the SKIP_NO_SKIP state).
This new comment is intended to bring a bit more clarity for any readers.
There is no intended behavioral change to the code here. The actual code
changes include better indentation to avoid an excessively-long line, and
using the more descriptive INITIAL rather than 0.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Courtney Goeltzenleuchter [Fri, 13 Dec 2013 19:12:53 +0000 (12:12 -0700)]
i965: Enhance intel_texsubimage_tiled_memcpy() to support all levels
Support all levels of a supported texture format.
Using 1024x1024, RGBA 8888 source, mipmap
  internal-format   Before (MB/sec)   mipmap (MB/sec)
  GL_RGBA           627.15            615.90
  GL_RGB            456.35            611.53
512x512
  GL_RGBA           597.00            619.95
  GL_RGB            440.62            611.28
256x256
  GL_RGBA           487.80            587.42
  GL_RGB            376.63            585.00
Benchmark has been sent to mesa-dev list: teximage_enh
Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Courtney Goeltzenleuchter [Fri, 13 Dec 2013 19:12:52 +0000 (12:12 -0700)]
i965: Add XRGB to intel_texsubimage_tiled_memcpy()
MESA_FORMAT_XRGB8888 is equivalent to MESA_FORMAT_ARGB8888 in terms
of storage on the device, so okay to use this optimized copy routine.
This series builds on work from Frank Henigman to optimize the
process of uploading a texture to the GPU. This series adds support for
MESA_XRGB_8888 and full miptrees, which were found to be common
in the Smokin' Guns game. The issue was found while profiling the app,
but that part is not benchmarked. Smokin' Guns uses mipmap textures with
an internal format of GL_RGB (MESA_XRGB_8888 in the driver).
These changes need a performance tool to run against to show how they
improve execution performance for specific texture formats. Using this
benchmark I've measured the following improvement on my Ivybridge
Intel(R) Xeon(R) CPU E3-1225 V2 @ 3.20GHz.
1024x1024 texture size
  internal-format   Before (MB/sec)   XRGB (MB/sec)
  GL_RGBA           628.15            627.15
  GL_RGB            265.95            456.35
512x512 texture size
  internal-format   Before (MB/sec)   XRGB (MB/sec)
  GL_RGBA           600.23            597.00
  GL_RGB            255.50            440.62
256x256 texture size
  internal-format   Before (MB/sec)   XRGB (MB/sec)
  GL_RGBA           489.08            487.80
  GL_RGB            229.03            376.63
Benchmark has been sent to mesa-dev list: teximage
Signed-off-by: Courtney Goeltzenleuchter <courtney@LunarG.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Paul Berry [Mon, 16 Dec 2013 23:10:42 +0000 (15:10 -0800)]
glsl: Fix gl_type of usamplerCube built-in type.
I'm not aware of any piglit tests that this fixes, but the old code
was obviously wrong.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Tue, 17 Dec 2013 18:11:27 +0000 (10:11 -0800)]
mesa: Add an assertion to _mesa_program_index_to_target().
Only a Mesa bug could cause this function to be called with an
out-of-range index, so raise an assertion if that ever happens.
Reviewed-by: Brian Paul <brianp@vmware.com>
Paul Berry [Tue, 17 Dec 2013 17:54:38 +0000 (09:54 -0800)]
mesa: Improve static error checking of arrays sized by MESA_SHADER_TYPES.
This patch replaces the following pattern:
foo bar[MESA_SHADER_TYPES] = {
...
};
With:
foo bar[] = {
...
};
STATIC_ASSERT(Elements(bar) == MESA_SHADER_TYPES);
This way, when a new shader type is added in a future version of Mesa,
we will get a compile error to remind us that the array needs to be
updated.
Reviewed-by: Brian Paul <brianp@vmware.com>
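A self-contained C sketch of the pattern from the MESA_SHADER_TYPES commit above. The enum, array, and TOY_* macros are hypothetical stand-ins for MESA_SHADER_TYPES, Elements() and STATIC_ASSERT(), and the macro here is built on C11 _Static_assert, which is an assumption rather than how Mesa defines its own.

```c
#include <assert.h>

enum toy_shader_type {
   TOY_SHADER_VERTEX,
   TOY_SHADER_GEOMETRY,
   TOY_SHADER_FRAGMENT,
   TOY_SHADER_TYPES   /* count; grows when a new stage is added */
};

#define TOY_ELEMENTS(a) (sizeof(a) / sizeof((a)[0]))
#define TOY_STATIC_ASSERT(cond) _Static_assert(cond, #cond)

/* No explicit size: the initializer list alone determines the length,
 * so a missing entry changes the element count instead of silently
 * leaving a NULL slot at the end. */
static const char *const toy_shader_names[] = {
   "vertex",
   "geometry",
   "fragment",
};
TOY_STATIC_ASSERT(TOY_ELEMENTS(toy_shader_names) == TOY_SHADER_TYPES);

static const char *toy_shader_name(enum toy_shader_type t)
{
   assert((int)t >= 0 && (int)t < TOY_SHADER_TYPES);
   return toy_shader_names[t];
}
```

Adding a fourth enumerator before TOY_SHADER_TYPES without extending the array makes the static assertion fail at compile time, which is the reminder the commit is after.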
Paul Berry [Tue, 17 Dec 2013 17:49:43 +0000 (09:49 -0800)]
glsl: Remove extraneous shader_type argument from analyze_clip_usage().
This argument was carrying the name of the shader target (as a
string). We can get this just as easily by calling
_mesa_shader_enum_to_string().
Reviewed-by: Brian Paul <brianp@vmware.com>
Paul Berry [Tue, 17 Dec 2013 17:46:08 +0000 (09:46 -0800)]
glsl: Get rid of hardcoded arrays of shader target names.
We already have a function for converting a shader type index to a
string: _mesa_shader_type_to_string().
Reviewed-by: Brian Paul <brianp@vmware.com>
Paul Berry [Tue, 17 Dec 2013 18:07:24 +0000 (10:07 -0800)]
main: Remove unused function _mesa_shader_index_to_type().
Reviewed-by: Brian Paul <brianp@vmware.com>
Paul Berry [Tue, 17 Dec 2013 20:13:11 +0000 (12:13 -0800)]
Rename overloads of _mesa_glsl_shader_target_name().
Previously, _mesa_glsl_shader_target_name() had an overload for GLenum
and an overload for the gl_shader_type enum, each of which behaved
differently. However, since GLenum is a synonym for unsigned int, and
unsigned ints are often used in place of gl_shader_type (e.g. in loop
indices), there was a big risk of calling the wrong overload by
mistake. This patch gives the two overloads different names so that
it's always clear which one we mean to call.
Reviewed-by: Brian Paul <brianp@vmware.com>
Kenneth Graunke [Mon, 30 Dec 2013 07:19:36 +0000 (23:19 -0800)]
Revert "mesa: Remove GLXContextID typedef from glx.h."
This reverts commit 136a12ac98868d82c2ae9fcc80d11044a7ec56d1.
According to belak51 on IRC, this commit broke Allegro, which would no
longer compile. Applications apparently expect the GLXContextID typedef
to exist in glx.h; removing it breaks them. A bit of searching around
the internet revealed other complaints since upgrading to Mesa 10.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Kenneth Graunke [Wed, 27 Nov 2013 06:22:13 +0000 (22:22 -0800)]
i965: Remove unused depth_mode parameter from translate_tex_format().
According to git blame, this hasn't been used in over two years:
commit d2235b0f4681f75d562131d655a6d7b7033d2d8b
Author: Eric Anholt <eric@anholt.net>
Date: Thu Nov 17 17:01:58 2011 -0800
i965: Always handle GL_DEPTH_TEXTURE_MODE through the shader.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Topi Pohjolainen [Thu, 19 Dec 2013 10:16:53 +0000 (12:16 +0200)]
i965/blorp: unit test compiling integer typed texture fetches
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Topi Pohjolainen [Tue, 10 Dec 2013 20:55:28 +0000 (22:55 +0200)]
i965/blorp: unit test compiling simple gen6 zero-src sampled
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Topi Pohjolainen [Sun, 8 Dec 2013 17:31:20 +0000 (19:31 +0200)]
i965/blorp: unit test compiling gen6 msaa-8 cms alpha blend
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Topi Pohjolainen [Sun, 8 Dec 2013 17:26:47 +0000 (19:26 +0200)]
i965/blorp: unit test compiling bilinear filtered
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Topi Pohjolainen [Sun, 8 Dec 2013 11:19:23 +0000 (13:19 +0200)]
i965/blorp: unit test compiling simple zero-src sampled
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Topi Pohjolainen [Sun, 8 Dec 2013 11:06:35 +0000 (13:06 +0200)]
i965/blorp: unit test compiling unaligned msaa-8
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Topi Pohjolainen [Sun, 8 Dec 2013 10:37:53 +0000 (12:37 +0200)]
i965/blorp: unit test compiling msaa-8 cms alpha blend
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Topi Pohjolainen [Tue, 10 Dec 2013 20:36:33 +0000 (22:36 +0200)]
i965/blorp: unit test compiling msaa-4 ums to cms
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Topi Pohjolainen [Tue, 10 Dec 2013 20:24:44 +0000 (22:24 +0200)]
i965/blorp: unit test compiling msaa-8 cms to cms
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Topi Pohjolainen [Sun, 8 Dec 2013 09:25:55 +0000 (11:25 +0200)]
i965/blorp: unit test compiling msaa-8 ums to cms
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Topi Pohjolainen [Thu, 5 Dec 2013 17:16:02 +0000 (19:16 +0200)]
i965/blorp: unit test compiling blend and scaled
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Topi Pohjolainen [Thu, 5 Dec 2013 15:59:29 +0000 (17:59 +0200)]
i965/blorp: allow unit tests to compile and dump assembly
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Topi Pohjolainen [Thu, 5 Dec 2013 15:34:56 +0000 (17:34 +0200)]
i965: dump the disassembly to the given file
instead of ignoring the argument and always dumping to
standard output.
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Topi Pohjolainen [Wed, 27 Nov 2013 14:21:11 +0000 (16:21 +0200)]
i965/fs: allow fs-generator use without gl_fragment_program
Prepares the generator to accept hand-crafted blorp programs.
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Topi Pohjolainen [Wed, 27 Nov 2013 12:32:41 +0000 (14:32 +0200)]
i965/fs: generate fs programs also without any 8-width instructions
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Rob Clark [Mon, 23 Dec 2013 14:59:42 +0000 (09:59 -0500)]
freedreno/a3xx: fix blend state corruption issue
Using RMW on banked context registers is not safe. The value read
could be the wrong one. So if there has been a DRAW_IDX launched,
the RMW must be preceded by a WAIT_FOR_IDLE to ensure the read part
of RMW sees the correct value.
To avoid unnecessary WFI's, keep track if there is a need for WFI,
and only emit one if needed. Furthermore, keep track if we even
need to update the register in the first place.
And to cut down on the amount of RMW to avoid excessive WFI's, at the
tiling/GMEM level we can always overwrite RB_RENDER_CONTROL, as the
state at beginning of draw/clear cmds (which we IB to) is always
undefined. In the draw/clear commands, we always still use RMW (with
WFI if needed), but only if the register value actually changes. (At
points where the current value cannot be known, the saved value is
reset to ~0, which includes bits outside of RBRC_DRAW_STATE, so there
is never a chance of confusion.)
Signed-off-by: Rob Clark <robclark@freedesktop.org>
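The bookkeeping described in the blend state commit above can be sketched in C as follows. Everything here is hypothetical: the struct, the counters that stand in for emitted WAIT_FOR_IDLE and read-modify-write packets, and the function names. It only illustrates the three rules from the message: skip unchanged values, emit a WFI only when a draw has happened since the last one, and reset the cached value to ~0 when it cannot be known.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tracking state, not the real freedreno context. */
struct toy_ctx {
   bool needs_wfi;           /* a draw was launched since the last WFI */
   uint32_t rbrc_draw_state; /* last value written, ~0 if unknown */
   int wfi_emitted;          /* stand-in for WAIT_FOR_IDLE packets */
   int rmw_emitted;          /* stand-in for RMW register writes */
};

static void toy_draw(struct toy_ctx *ctx)
{
   ctx->needs_wfi = true;
}

/* Called where the current register value cannot be known. */
static void toy_invalidate(struct toy_ctx *ctx)
{
   ctx->rbrc_draw_state = ~0u;
}

static void toy_update_draw_state(struct toy_ctx *ctx, uint32_t val)
{
   /* Nothing to do if the register already holds this value. */
   if (ctx->rbrc_draw_state == val)
      return;

   /* The read half of the RMW is only safe once earlier draws are idle. */
   if (ctx->needs_wfi) {
      ctx->wfi_emitted++;
      ctx->needs_wfi = false;
   }

   ctx->rmw_emitted++;
   ctx->rbrc_draw_state = val;
}
```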
Rob Clark [Sat, 21 Dec 2013 01:48:18 +0000 (20:48 -0500)]
freedreno: prepare for hw binning
Actually assign VSC_PIPE's properly, which will be needed for tiling.
And introduce fd_tile for per-tile state (including the assignment of
tile to VSC_PIPE). This gives us the proper pipe setup that we'll
need for hw binning pass, and also cleans things up a bit by not having
to pass so many parameters around. And will also make it easier to
introduce different tiling patterns (since we may no longer render
tiles in a simple left-to-right top-to-bottom pattern).
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Fri, 20 Dec 2013 23:08:54 +0000 (18:08 -0500)]
freedreno: resync generated headers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Alex Deucher [Tue, 24 Dec 2013 20:22:31 +0000 (15:22 -0500)]
r600g: fix SUMO2 pci id
0x9649 is sumo2, not sumo.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
CC: "9.2" "10.0" <mesa-stable@lists.freedesktop.org>
Vinson Lee [Thu, 19 Dec 2013 23:55:28 +0000 (15:55 -0800)]
scons: Add system library linker flags on LLVM 3.5.
llvm-3.5svn r197664 split out the linker flags from ldflags to
system-libs.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Aaron Watry [Fri, 15 Nov 2013 22:09:41 +0000 (16:09 -0600)]
r600/pipe: Stop leaking context->start_compute_cs_cmd.buf on EG/CM
Found while tracking down memory leaks in VDPAU playback
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Aaron Watry [Fri, 15 Nov 2013 22:07:31 +0000 (16:07 -0600)]
st/vdpau: Destroy context when initialization fails
Prevents a potential memory leak found when tracking down something else.
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Aaron Watry [Fri, 8 Nov 2013 19:59:59 +0000 (13:59 -0600)]
radeon/llvm: Free target data at end of optimization
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Aaron Watry [Fri, 8 Nov 2013 19:53:10 +0000 (13:53 -0600)]
r600/compute: Use the correct FREE macro when deleting compute state
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Aaron Watry [Thu, 12 Dec 2013 22:35:54 +0000 (16:35 -0600)]
r600/compute: Free compiled kernels when deleting compute state
v2: Remove unnecessary null pointer check
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Aaron Watry [Thu, 12 Dec 2013 22:34:09 +0000 (16:34 -0600)]
radeon/compute: Stop leaking LLVMContexts in radeon_llvm_parse_bitcode
Previously we were creating a new LLVMContext every time that we called
radeon_llvm_parse_bitcode, which caused us to leak the context every time
that we compiled a CL program.
Sadly, we can't dispose of the LLVMContext at the point that it was being
created because evergreen_launch_grid (and possibly the SI equivalent) was
assuming that the context used to compile the kernels was still available.
Now, we'll create a new LLVMContext when creating EG/SI compute state, store
it there, and pass it to all of the places that need it.
The LLVM Context gets destroyed when we delete the EG/SI compute state.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
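The ownership change described in the LLVMContext commit above can be sketched in C with stand-in types. Nothing here uses the real LLVM API: toy_llvm_context, the live-instance counter, and all function names are hypothetical and exist only to show one context being created with the compute state, borrowed by each parse, and destroyed with the state.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an LLVMContext. */
struct toy_llvm_context {
   int modules_parsed;
};

static int toy_live_contexts;   /* illustration only: leak detector */

static struct toy_llvm_context *toy_context_create(void)
{
   struct toy_llvm_context *c = calloc(1, sizeof(*c));
   if (c)
      toy_live_contexts++;
   return c;
}

static void toy_context_dispose(struct toy_llvm_context *c)
{
   if (c) {
      toy_live_contexts--;
      free(c);
   }
}

/* Hypothetical compute state that owns the context for its lifetime. */
struct toy_compute_state {
   struct toy_llvm_context *llvm_ctx;
};

static struct toy_compute_state *toy_create_compute_state(void)
{
   struct toy_compute_state *s = calloc(1, sizeof(*s));
   if (!s)
      return NULL;
   s->llvm_ctx = toy_context_create();
   if (!s->llvm_ctx) {
      free(s);
      return NULL;
   }
   return s;
}

/* Borrows the caller's context instead of creating a fresh one. */
static int toy_parse_bitcode(struct toy_llvm_context *ctx)
{
   assert(ctx != NULL);
   return ++ctx->modules_parsed;
}

static void toy_delete_compute_state(struct toy_compute_state *s)
{
   if (!s)
      return;
   toy_context_dispose(s->llvm_ctx);
   free(s);
}
```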
Aaron Watry [Fri, 8 Nov 2013 19:45:05 +0000 (13:45 -0600)]
pipe_loader/sw: close dev->lib when initialization fails
Prevents a memory leak.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Aaron Watry [Fri, 8 Nov 2013 16:15:44 +0000 (10:15 -0600)]
clover: Remove unused variable
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
CC: "10.0" <mesa-stable@lists.freedesktop.org>
Jonathan Liu [Mon, 16 Dec 2013 01:24:00 +0000 (18:24 -0700)]
llvmpipe: use pipe_sampler_view_release() to avoid segfault
This fixes another case of faulting when freeing a pipe_sampler_view
that belongs to a previously destroyed context.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jonathan Liu <net147@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Jonathan Liu [Sat, 14 Dec 2013 14:15:00 +0000 (07:15 -0700)]
st/mesa: use pipe_sampler_view_release()
This fixes a crash where old_view->context was already freed in the
pipe_sampler_view_reference function contained in
src/gallium/auxiliary/utils/u_inlines.h. As a result, the
sampler_view_destroy function pointer contained 0xfeeefeee indicating
freed heap memory.
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Jonathan Liu <net147@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Henri Verbeet [Sun, 15 Dec 2013 11:23:38 +0000 (12:23 +0100)]
i915: Add support for gl_FragData[0] reads.
Similar to 556a47a2621073185be83a0a721a8ba93392bedb, without this reading from
gl_FragData[0] would cause a software fallback.
Bugzilla: https://bugs.winehq.org/show_bug.cgi?id=33964
Signed-off-by: Henri Verbeet <hverbeet@gmail.com>
Cc: 10.0 9.2 9.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Andreas Hartmetz [Sat, 21 Dec 2013 20:11:37 +0000 (21:11 +0100)]
radeonsi: Use htile_buffer for depth only when there is no stencil.
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Niels Ole Salscheider [Wed, 18 Dec 2013 18:11:44 +0000 (19:11 +0100)]
winsys/radeon: remove superfluous distinction of cases
Signed-off-by: Niels Ole Salscheider <niels_ole@salscheider-online.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Mark Mueller [Sat, 21 Dec 2013 03:14:08 +0000 (19:14 -0800)]
mesa: inline r200 radeon texture format macros to facilitate search and replace
Signed-off-by: Mark Mueller <MarkKMueller@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Lauri Kasanen [Thu, 19 Dec 2013 19:43:25 +0000 (21:43 +0200)]
mesa: Fix build to properly check for supported compiler flags
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72708
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Lauri Kasanen <cand@gmx.com>
Ian Romanick [Wed, 13 Nov 2013 22:27:11 +0000 (14:27 -0800)]
mesa: It is not possible to have GLSL < 1.20
This hasn't been possible for a long time.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Ian Romanick [Wed, 13 Nov 2013 22:15:11 +0000 (14:15 -0800)]
mesa: Clean up bad code formatting left from previous commit
Also s/_EXT// on enums that are now part of core.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Ian Romanick [Wed, 13 Nov 2013 22:10:34 +0000 (14:10 -0800)]
mesa: GL_EXT_packed_depth_stencil is not optional
Every driver supports it. All current and future Gallium drivers always
support it, and all existing classic drivers support it.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Ian Romanick [Wed, 13 Nov 2013 21:24:14 +0000 (13:24 -0800)]
radeon: Sort list of enabled extensions
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Ian Romanick [Wed, 13 Nov 2013 21:17:28 +0000 (13:17 -0800)]
r200: Sort list of enabled extensions
Note that ARB_occlusion_query was previously enabled twice.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Lauri Kasanen [Sun, 15 Dec 2013 10:37:55 +0000 (12:37 +0200)]
glx: Simplify __glxGetMscRate, it only needs the screen, not a drawable
Useful in its own right, but also needed for adaptive vsync.
No regressions in the piglit glx-oml-sync-control-getmscrate test.
Signed-off-by: Lauri Kasanen <cand@gmx.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
Keith Packard [Sun, 24 Nov 2013 05:58:14 +0000 (21:58 -0800)]
dri3: Rename DRI3_MAX_BACK to DRI3_NUM_BACK
It is the maximum number of back buffers, but the name is confusing and is
easily read as the maximum back buffer index. Change to DRI3_NUM_BACK to make
the intended usage a bit clearer.
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Keith Packard [Fri, 22 Nov 2013 13:41:38 +0000 (05:41 -0800)]
i965: Set fast color clear mcs_state on newly allocated image miptrees
Just copying code from the dri2 path to set up the fast color clear state.
This also removes a couple of bogus intel_region_reference calls.
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Keith Packard [Fri, 22 Nov 2013 13:39:15 +0000 (05:39 -0800)]
i965: Correct check for re-bound buffer in intel_update_image_buffer
The buffer-object is the persistent thing passed through the loader, so when
updating an image buffer, check to see if it is already bound to the provided
bo. The region, on the other hand, is allocated separately for the miptree,
and so will never be the same as that passed back from the loader.
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Keith Packard [Tue, 26 Nov 2013 05:10:02 +0000 (21:10 -0800)]
dri3: Clean up struct dri3_drawable
Move the depth field up with width and height.
Remove unused previous_time and frames fields.
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Keith Packard [Fri, 22 Nov 2013 05:30:07 +0000 (21:30 -0800)]
dri3: Free resources when drawable is destroyed.
Always nice to clean up after ourselves.
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Keith Packard [Fri, 22 Nov 2013 04:22:16 +0000 (20:22 -0800)]
dri3: Switch to libxshmfence version 1.1
libxshmfence v1.0 foolishly used 'int32_t *' for the fence type, which
works when the fence is a linux futex. However, version 1.1
changes the exported datatype to 'struct xshmfence *'
Require libxshmfence version 1.1 and switch the API around.
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Wed, 27 Nov 2013 07:15:39 +0000 (23:15 -0800)]
i965: Use RED for depth texture formats rather than INTENSITY.
While looking through the documentation, I found this in the Sandybridge
PRM (Volume 4, Part 1, Page 140):
"Use of sample_c with SURFTYPE_CUBE surfaces is undefined with the
following surface formats: I24X8_UNORM, L24X8_UNORM, A24X8_UNORM,
I32_FLOAT, L32_FLOAT, A32_FLOAT."
I haven't observed this to be true, but it suggests that we may want to
use other formats.
We already perform DEPTH_TEXTURE_MODE swizzling in the shaders, and
don't rely on the surface format to splat things appropriately. So
using RED should work just as well as INTENSITY.
A few notes about the formats:
- R24_UNORM_X8_TYPELESS has the exact same properties as I24X8_UNORM.
- R16_UNORM and R32_FLOAT are additionally supported as a render target,
while the old I16_UNORM/I32_FLOAT formats are not.
- R32_FLOAT_X8X24_TYPELESS is not supported as a render target, while
the old format (R32G32_FLOAT) was. However, it shares the same
properties as the formats we use for Z24, so it should suffice.
This makes translate_tex_format and brw_blorp_surface_info::set
a bit more similar.
No Piglit changes on Sandybridge or Ivybridge. No oglconform changes on
Sandybridge.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Chad Versace [Fri, 20 Dec 2013 12:39:03 +0000 (04:39 -0800)]
i965/gen6: Fix HiZ hang in WebGL Google Maps
Emitting flushes before depth and hiz resolves at the top of blorp's
state emission fixes the hang. Marchesin and I found the fix
experimentally, as opposed to adhering to a documented hardware
workaround. A more minimal fix likely exists, but this gets the job
done.
Fixes HiZ hangs in the new WebGL Google Maps on Sandybridge Chrome OS.
Tested by zooming in and out continuously for 2 hours.
This patch is based on
https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/8bc07bb70163c3706fb4ba5f980e57dc942f56dd
CC: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70740
Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Sat, 14 Dec 2013 00:10:02 +0000 (16:10 -0800)]
i965: Store QPitch in intel_mipmap_tree.
Broadwell allows us to specify an arbitrary value for QPitch, rather
than baking a specific formula into the hardware and requiring software
to lay things out to match. The only restriction is that the
software-provided QPitch needs to be large enough so successive array
slices do not overlap.
In order to support this flexibility, software needs to specify QPitch
in a bunch of packets. Storing QPitch makes that easy, and allows us to
adjust it in a single place should we wish to change it in the future.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Kenneth Graunke [Tue, 10 Dec 2013 09:53:26 +0000 (01:53 -0800)]
i965: Add support for Broadwell's new register types.
Broadwell introduces support for Q, UQ, and HF types. It also extends
DF support to allow immediate values.
Irritatingly, although HF and DF both support immediates, they're
represented by a different value depending on the register file.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Tue, 10 Dec 2013 09:49:18 +0000 (01:49 -0800)]
i965: Add BRW_REGISTER_TYPE_DF.
Ivybridge, Baytrail, and Haswell support double float register types,
but do not support them as immediate values.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Tue, 10 Dec 2013 08:33:56 +0000 (00:33 -0800)]
i965: Abstract BRW_REGISTER_TYPE_* into an enum with unique values.
On released hardware, values 4-6 are overloaded. For normal registers,
they mean UB/B/DF. But for immediates, they mean UV/VF/V.
Previously, we just created #defines for each name, reusing the same
value. This meant we could directly splat the brw_reg::type field into
the assembly encoding, which was fairly nice, and worked well.
Unfortunately, Broadwell makes this infeasible: the HF and DF types are
represented as different numeric values depending on whether the
source register is an immediate or not.
To preserve sanity, I decided to simply convert BRW_REGISTER_TYPE_* to
an abstract enum that has a unique value for each register type, and
write translation functions. One nice benefit is that we can add
assertions about register files and generations.
I've chosen not to convert brw_reg::type to the enum, since converting
it caused a lot of trouble due to C++ enum rules (even though it's
defined in an extern "C" block...).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
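The abstract-enum approach described in this commit can be sketched roughly
as follows. This is a minimal illustration, not the driver's code: the names
reg_type, REG_TYPE_*, reg_type_to_hw_encoding and encode_reg_type_checked are
hypothetical, and only the overloaded encodings 4-6 mentioned in the message
are modelled. Each type gets a unique abstract value, and a translation
function picks the hardware encoding based on whether the operand is an
immediate.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical abstract register types: every type has its own value,
 * so UB and UV can no longer be confused even though both encode as 4. */
enum reg_type {
   REG_TYPE_UB,
   REG_TYPE_B,
   REG_TYPE_DF,
   REG_TYPE_UV,
   REG_TYPE_VF,
   REG_TYPE_V,
};

/* Translate an abstract type into the overloaded hardware encoding.
 * Returns -1 when the type is not valid for the given register file. */
int
reg_type_to_hw_encoding(enum reg_type type, bool is_immediate)
{
   if (is_immediate) {
      switch (type) {
      case REG_TYPE_UV: return 4;
      case REG_TYPE_VF: return 5;
      case REG_TYPE_V:  return 6;
      default:          return -1;
      }
   }

   switch (type) {
   case REG_TYPE_UB: return 4;
   case REG_TYPE_B:  return 5;
   case REG_TYPE_DF: return 6;
   default:          return -1;
   }
}

/* Variant that asserts instead of returning an error, in the spirit of
 * the "assertions about register files" mentioned in the message. */
int
encode_reg_type_checked(enum reg_type type, bool is_immediate)
{
   int hw = reg_type_to_hw_encoding(type, is_immediate);
   assert(hw >= 0 && "register type not valid for this register file");
   return hw;
}
```

With this shape, a type that only exists as an immediate (such as UV) is
rejected when requested for a normal register instead of silently aliasing
UB.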
Kenneth Graunke [Tue, 10 Dec 2013 09:36:37 +0000 (01:36 -0800)]
i965: Decode three-source register types directly.
Three-source instructions use a different encoding for register types
(and have a much more limited set to choose from).
Previously, we translated those into BRW_REGISTER_TYPE_* values, then
reused the existing reg_encoding mapping.
Doing it directly is more straightforward and actually less code.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Tue, 10 Dec 2013 09:21:54 +0000 (01:21 -0800)]
i965: Disassemble UV types, not UB types.
UB types have never been supported as immediates. On Gen4-5, register
encoding 4 is "Reserved." On Gen6+, it means UV.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Tue, 10 Dec 2013 09:18:34 +0000 (01:18 -0800)]
i965: Add missing BRW_REGISTER_TYPE_UV.
Sandybridge added support for packed unsigned vectors.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Wed, 18 Dec 2013 20:30:06 +0000 (12:30 -0800)]
i965: Fix 3DSTATE_PUSH_CONSTANT_ALLOC_PS packet creation.
When adding geometry shader support, we accidentally reversed the size
and offset parameters.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Kenneth Graunke [Tue, 10 Dec 2013 00:06:51 +0000 (16:06 -0800)]
i965: Use {point_sprite,flat}_enable variable names instead of dw*.
Calling the local variables flat_enable and point_sprite_enable is
clearer than dw16 and such. It also matches the names used in
calculate_attr_overrides, which computes them.
v2: Add /* dw16 */ and /* dw10 */ comments, requested by Jordan.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Kenneth Graunke [Mon, 9 Dec 2013 23:58:35 +0000 (15:58 -0800)]
i965: Zero out {point_sprite,flat}_enables in calculate_attr_overrides.
calculate_attr_overrides is responsible for computing the point sprite
and flat-shading enable bitfields. It does so by OR'ing in a bunch of
bits. However, it relied on the caller to set the initial value to
zero. This is pretty fragile - if the caller neglects to zero out those
variables, then the enable bitfields end up full of garbage, which shows
up as random things being flat-shaded.
This patch moves the zero-initialization into calculate_attr_overrides,
so that the computation is completely in one place.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
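The fragility described in this commit, and the fix, can be illustrated with
a small sketch. It is a simplified stand-in rather than the real
calculate_attr_overrides: the function name compute_attr_enables and its
per-attribute flag arrays are assumptions made for the example. The point is
that the function clears its output bitfields itself before OR'ing bits in,
so a caller that passes uninitialized storage still gets correct results.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for calculate_attr_overrides.
 * is_point_sprite[i] and is_flat[i] say whether attribute i needs the
 * corresponding enable bit; count must not exceed 32. */
void
compute_attr_enables(const int *is_point_sprite, const int *is_flat,
                     int count,
                     uint32_t *point_sprite_enables,
                     uint32_t *flat_enables)
{
   assert(count >= 0 && count <= 32);

   /* Zero the outputs here instead of relying on the caller to do it. */
   *point_sprite_enables = 0;
   *flat_enables = 0;

   for (int i = 0; i < count; i++) {
      if (is_point_sprite[i])
         *point_sprite_enables |= UINT32_C(1) << i;
      if (is_flat[i])
         *flat_enables |= UINT32_C(1) << i;
   }
}
```

Without the two zeroing assignments, any stale bits already present in the
caller's variables would survive the OR operations, which matches the
"random things being flat-shaded" symptom described above.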