Ian Romanick [Mon, 20 Oct 2014 22:35:46 +0000 (15:35 -0700)]
glsl_to_tgsi: Remove st_new_shader
It was identical to the default implementation in _mesa_new_shader.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Ian Romanick [Mon, 20 Oct 2014 22:30:30 +0000 (15:30 -0700)]
glsl_to_tgsi: Remove st_new_shader_program
It was identical to the default implementation in
_mesa_new_shader_program.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Ian Romanick [Mon, 20 Oct 2014 22:26:42 +0000 (15:26 -0700)]
i965: Remove brw_new_shader_program
It was identical to the default implementation in
_mesa_new_shader_program.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Ian Romanick [Mon, 20 Oct 2014 21:50:55 +0000 (14:50 -0700)]
mesa: Silence unused parameter warning in _mesa_clear_shader_program_data
Just remove the parameter.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Ian Romanick [Mon, 20 Oct 2014 21:40:34 +0000 (14:40 -0700)]
linker: Rely on _mesa_clear_shader_program_data to clear link information
_mesa_link_shader_program already calls _mesa_clear_shader_program_data
before calling link_shaders, so this is already done.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ian Romanick [Mon, 20 Oct 2014 21:35:01 +0000 (14:35 -0700)]
mesa: Add some missing clean-up to _mesa_clear_shader_program_data
All of this is already done in link_shaders. More clean-ups coming.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Ian Romanick [Tue, 21 Oct 2014 00:43:08 +0000 (17:43 -0700)]
mesa: Remove prototypes for nonexistent functions
_mesa_UseShaderProgramEXT, _mesa_ActiveProgramEXT, and
_mesa_CreateShaderProgramEXT were all removed when support for
GL_EXT_separate_shader_objects was removed.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Ian Romanick [Thu, 23 Oct 2014 16:20:26 +0000 (09:20 -0700)]
ff_fragment_shader: Silence unused parameter warning in smear
Just remove the parameter. Silences:
../../src/mesa/main/ff_fragment_shader.cpp:668:1: warning: unused parameter 'p' [-Wunused-parameter]
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Ian Romanick [Sat, 25 Oct 2014 00:59:05 +0000 (17:59 -0700)]
meta: Only use _mesa_ClipControl if the extension is supported
Fixes many piglit failures on IVB since 85edaa8.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85425
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Cc: Mathias Fröhlich <Mathias.Froehlich@gmx.net>
Emil Velikov [Sat, 25 Oct 2014 01:13:11 +0000 (01:13 +0000)]
docs: add news item and link release notes
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Sat, 25 Oct 2014 00:43:12 +0000 (00:43 +0000)]
docs: Add sha256 sums for the 10.3.2 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 95994706429e08665d1d33d248c8bcd67d40251e)
Emil Velikov [Sat, 25 Oct 2014 00:33:38 +0000 (00:33 +0000)]
Add release notes for the 10.3.2 release
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
(cherry picked from commit 3b6a4758fa8958db4b76e6d7efccc93b12b1da06)
Jason Ekstrand [Sat, 4 Oct 2014 01:13:05 +0000 (18:13 -0700)]
i965/fs: Compute q-values for register allocation manually
Previously, we were allowing the register allocation code to do the
computation for us in ra_set_finalize. However, the runtime for this
computation is O(c^4 * g) where c is the number of classes and g is the
number of GRF registers. These q-values are directly computable based on
the way we lay out our register classes, so there is no need for the
awful runtime algorithm.
We were doing ok until commit 7210583eb where we bumped the number of
register classes from 11 to 16. While startup times don't normally matter,
this caused piglit to take 4 times as long to run on Bay Trail. This patch
should make generating the ra_set much faster and melt the piglit run
times.
v2: Fixed a couple of bugs. I have now verified that the same q-values are
generated both ways.
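A minimal sketch of why the q-values can be computed directly, assuming
each class holds every run of N contiguous base registers. The names below
are illustrative and are not the actual i965 code. Holding a class-C
register fixed and sliding a class-B register across it, the two overlap at
size_b + size_c - 1 start positions, which the brute-force count confirms:

```c
#include <stdbool.h>

/* Hypothetical layout: a register of `size` base registers starting at
 * `start` covers the half-open range [start, start + size). */
static bool regs_overlap(int start_b, int size_b, int start_c, int size_c)
{
   return start_b < start_c + size_c && start_c < start_b + size_b;
}

/* Closed form: the largest number of class-B registers that a single
 * class-C register can conflict with. Assumes there are enough base
 * registers for the fixed register to sit away from both ends. */
static int q_value_direct(int size_b, int size_c)
{
   return size_b + size_c - 1;
}

/* Brute-force reference: tries every class-C register against every
 * class-B register and keeps the worst conflict count. */
static int q_value_brute_force(int base_regs, int size_b, int size_c)
{
   int worst = 0;
   for (int c = 0; c + size_c <= base_regs; c++) {
      int conflicts = 0;
      for (int b = 0; b + size_b <= base_regs; b++) {
         if (regs_overlap(b, size_b, c, size_c))
            conflicts++;
      }
      if (conflicts > worst)
         worst = conflicts;
   }
   return worst;
}
```

The closed form is constant time per class pair, whereas the reference
loop scales with the square of the base register count.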
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Tue, 7 Oct 2014 04:27:06 +0000 (21:27 -0700)]
i965/fs: Don't interfere with too many base registers
On older GENs in SIMD16 mode, we were accidentally building too much
interference into our register classes. Since everything is divided by 2,
the register allocator thinks we have 64 base registers instead of 128.
The actual GRF mapping still needs to be doubled, but as far as the ra_set
is concerned, we only have 64. We were accidentally adding way too much
interference.
Signed-off-by: Jason Ekstrand <jason.ekstrand@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Tue, 14 Oct 2014 02:41:17 +0000 (19:41 -0700)]
i965/fs: Properly precolor payload registers on GEN5 in SIMD16
For GEN5 SIMD16 mode, we have to 2-align all the registers, so we only have
the even-numbered ones. This means that we have to divide the register
number by 2 when we precolor. This wasn't a problem before because we were
setting up the interference between ra_node registers wrong. This will be
fixed in the next commit.
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Sat, 4 Oct 2014 01:09:52 +0000 (18:09 -0700)]
i965/fs: Add another use of MAX_VGRF_SIZE
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Jason Ekstrand [Sat, 4 Oct 2014 01:08:12 +0000 (18:08 -0700)]
util: Use reg_belongs_to_class instead of BITSET_TEST
This shouldn't be a functional change since reg_belongs_to_class is just a
wrapper around BITSET_TEST. It just makes the code a little easier to
read.
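A rough sketch of what such a wrapper looks like. The bitset macro and
struct here are simplified stand-ins written for this example, not the
definitions in Mesa's util code:

```c
#include <limits.h>
#include <stdbool.h>

/* Simplified stand-ins for illustration only. */
#define WORD_BITS (sizeof(unsigned) * CHAR_BIT)
#define BITSET_TEST(set, bit) \
   (((set)[(bit) / WORD_BITS] >> ((bit) % WORD_BITS)) & 1u)

/* Hypothetical register class: one bit per register. */
struct ra_class_sketch {
   unsigned regs[4];
};

/* Same test as the raw macro, but the call site now says what is being
 * asked instead of how the answer is stored. */
static bool reg_belongs_to_class(unsigned r, const struct ra_class_sketch *c)
{
   return BITSET_TEST(c->regs, r);
}
```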
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
José Fonseca [Fri, 24 Oct 2014 19:27:31 +0000 (20:27 +0100)]
llvmpipe: Ensure the packed input of the lp_test_format is aligned.
Fixes:
- https://bugs.freedesktop.org/show_bug.cgi?id=85377
- http://llvm.org/bugs/show_bug.cgi?id=21365
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
José Fonseca [Fri, 24 Oct 2014 18:54:28 +0000 (19:54 +0100)]
llvmpipe: Flush stdout on lp_test_* unit tests.
So that the order of test messages and gallivm/llvmpipe debug output is
preserved.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Mathias Fröhlich [Sun, 21 Sep 2014 16:09:22 +0000 (18:09 +0200)]
gallium: Enable ARB_clip_control for gallium drivers.
Gallium should already be prepared for ARB_clip_control.
So enable this and mention it in the release notes.
v2:
Only enable for drivers announcing the freshly introduced
PIPE_CAP_CLIP_HALFZ capability.
v3:
Use extension enable infrastructure to connect PIPE_CAP_CLIP_HALFZ
with ARB_clip_control.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
Mathias Fröhlich [Sun, 14 Sep 2014 13:17:07 +0000 (15:17 +0200)]
gallium: introduce PIPE_CAP_CLIP_HALFZ.
In preparation for ARB_clip_control. Let the driver decide if
it supports pipe_rasterizer_state::clip_halfz being set to true.
v3:
Initially enable on ilo.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
Mathias Fröhlich [Thu, 25 Sep 2014 17:39:31 +0000 (19:39 +0200)]
mesa: Handle clip control in meta operations.
Restore clip control to the default state if MESA_META_VIEWPORT
or MESA_META_DEPTH_TEST is requested.
v3:
Handle clip control state with MESA_META_TRANSFORM.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
Mathias Fröhlich [Sun, 21 Sep 2014 16:09:22 +0000 (18:09 +0200)]
mesa: Implement ARB_clip_control.
Implement the mesa parts of ARB_clip_control.
So far no driver enables this.
v3:
Restrict getting clip control state to the availability
of ARB_clip_control.
Move to transformation state.
Handle clip control state with the GL_TRANSFORM_BIT.
Move _FrontBit update into state.c.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
Mathias Fröhlich [Sun, 21 Sep 2014 16:09:21 +0000 (18:09 +0200)]
mesa: Refactor viewport transform computation.
This is in preparation for ARB_clip_control.
v3:
Add comments.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Froehlich <Mathias.Froehlich@web.de>
Eric Anholt [Fri, 24 Oct 2014 16:16:59 +0000 (17:16 +0100)]
vc4: Reuse uniform_data/contents indices when making uniforms.
This allows vc4_opt_cse.c to CSE-away operations involving the same
uniform values.
total instructions in shared programs: 37341 -> 36906 (-1.16%)
instructions in affected programs: 10233 -> 9798 (-4.25%)
total uniforms in shared programs: 10523 -> 10320 (-1.93%)
uniforms in affected programs: 2467 -> 2264 (-8.23%)
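The idea can be sketched roughly as follows, assuming a flat table of
(contents, data) pairs. The struct and function names are hypothetical and
do not come from the vc4 source. Returning the same index for the same pair
means later passes see identical operands and can treat them as common
subexpressions:

```c
#include <stdint.h>

#define MAX_UNIFORMS 64

/* Hypothetical, simplified uniform table. */
struct uniform_table {
   int contents[MAX_UNIFORMS];    /* what kind of uniform the slot holds */
   uint32_t data[MAX_UNIFORMS];   /* parameter for that kind */
   int count;
};

/* Returns the slot index for (contents, data), reusing an existing slot
 * when the same pair was requested before. Returns -1 if the table is
 * full. */
static int get_uniform_index(struct uniform_table *t, int contents,
                             uint32_t data)
{
   for (int i = 0; i < t->count; i++) {
      if (t->contents[i] == contents && t->data[i] == data)
         return i;
   }
   if (t->count == MAX_UNIFORMS)
      return -1;
   t->contents[t->count] = contents;
   t->data[t->count] = data;
   return t->count++;
}
```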
Eric Anholt [Fri, 24 Oct 2014 15:50:37 +0000 (16:50 +0100)]
vc4: When asked to discard-map a whole resource, discard it.
This saves a bunch of extra flushes when texsubimaging a whole texture
that's been used for rendering, or subdataing a whole BO. In particular,
this massively reduces the runtime of piglit texture-packed-formats (when
the probes have been moved out of the inner loop).
Eric Anholt [Fri, 24 Oct 2014 15:45:04 +0000 (16:45 +0100)]
vc4: Refactor flushing before mapping a BO.
I'm going to want to make some other decisions here before flushing.
Eric Anholt [Fri, 24 Oct 2014 14:03:04 +0000 (15:03 +0100)]
vc4: Allow dead code elimination of unused varyings.
total instructions in shared programs: 39022 -> 37341 (-4.31%)
instructions in affected programs: 26979 -> 25298 (-6.23%)
total uniforms in shared programs: 11242 -> 10523 (-6.40%)
uniforms in affected programs: 5836 -> 5117 (-12.32%)
Eric Anholt [Wed, 22 Oct 2014 17:02:18 +0000 (18:02 +0100)]
vc4: Add debug output to match shaderdb info to program dumps.
I'm going to be using VC4_DEBUG=shaderdb,norast to do shaderdb stats, but
when debugging regressions, I want to match shaderdb output to shader
disassembly.
Andreas Boll [Thu, 23 Oct 2014 12:52:55 +0000 (14:52 +0200)]
radeon: enable Hyper-Z on r600g and radeonsi by default
This reverts commit 01e637114914453451becc0dc8afe60faff48d84.
Since then many Hyper-Z issues have been fixed or worked around.
Enable Hyper-Z by default so that we get enough feedback for the upcoming
mesa 10.4 release.
If you have issues with Hyper-Z try to disable Hyper-Z using the environment
variable R600_DEBUG=nohyperz and please report the issue on the bugtracker.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75011
See also: https://bugs.freedesktop.org/show_bug.cgi?id=75112
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Matt Turner [Thu, 23 Oct 2014 22:45:35 +0000 (15:45 -0700)]
i965: Silence unused variable warning.
Matt Turner [Thu, 23 Oct 2014 22:45:15 +0000 (15:45 -0700)]
i965/fs: Silence uninitialized variable warning.
The compiler isn't privy to the knowledge that we're doing at least one
framebuffer write.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Matt Turner [Tue, 30 Sep 2014 23:24:39 +0000 (16:24 -0700)]
util: Add assume() macro.
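For readers unfamiliar with the idiom, here is one way such a macro can be
written. This is a hypothetical definition for GCC and Clang with a no-op
fallback, and not necessarily the one this commit adds. Debug builds check
the condition, release builds pass it to the optimizer as a fact:

```c
#include <assert.h>

/* Hypothetical definition for illustration. */
#if defined(__GNUC__) || defined(__clang__)
#define UNREACHABLE() __builtin_unreachable()
#else
#define UNREACHABLE() ((void) 0)
#endif

#define assume(expr) \
   ((expr) ? (void) 0 : (assert(!"assumption failed"), UNREACHABLE()))

/* The caller promises that n is a nonzero power of two. */
static unsigned log2_of_pot(unsigned n)
{
   unsigned r = 0;

   assume(n != 0 && (n & (n - 1)) == 0);
   while (n > 1) {
      n >>= 1;
      r++;
   }
   return r;
}
```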
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Jan Vesely [Wed, 13 Aug 2014 20:47:28 +0000 (16:47 -0400)]
glapi: Fix compiler warning and script name
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Rob Clark [Thu, 23 Oct 2014 12:50:50 +0000 (08:50 -0400)]
Revert "freedreno/a3xx: only emit dirty consts"
This reverts commit 94bb33617d1e8978dc52b8aaa4eb41bfb6703f79.
Which somehow broke gnome-shell.. and needs more investigation. For
now, revert..
Rob Clark [Wed, 22 Oct 2014 20:36:24 +0000 (16:36 -0400)]
freedreno: fix PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE
fd_bo_cpu_prep() doesn't realize the bo is already referenced in
unflushed cmdstream. It could be made to do so (but would have to be
implemented twice, ie. both for msm and kgsl). But we still can't do
the expected thing if the caller isn't using _NOSYNC. Because of the
way the tiling works, we need to build quite a bit of cmdstream at flush
time, which is not possible to do at the libdrm level.
So rather than trying to make fd_bo_cpu_prep() smarter than it can
possibly be, just *always* discard and reallocate if the
PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE flag is set.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Jan Vesely [Tue, 21 Oct 2014 15:59:34 +0000 (11:59 -0400)]
clover: Require libelf
v2: test for libelf once, check in both radeon and clover
CC: Tom Stellard <tom@stellard.net>
CC: Emil Velikov <emil.l.velikov@gmail.com>
CC: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Mon, 20 Oct 2014 08:53:39 +0000 (09:53 +0100)]
clover: use correct typenames for compat::pair's first/second
Seems to be a typo judging from the overall declaration of the
template.
Cc: EdB <edb+mesa@sigluy.net>
Cc: Francisco Jerez <currojerez@riseup.net>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Emil Velikov [Sun, 19 Oct 2014 15:16:51 +0000 (16:16 +0100)]
auxiliary/os: get the mmap/munmap wrappers working with android
- Use macro for munmap under Android - the STATIC_ASSERT uses
an off_t which is not used under Android for mmap. Since the size of
loff_t does not vary the way off_t does, just ignore the assert.
- Wrap the long lines to improve readability.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Mauro Rossi [Sun, 19 Oct 2014 15:16:49 +0000 (16:16 +0100)]
gallium/nouveau: fully build the driver under android
Fix the trivial typo in the variable name.
Cc: "10.2 10.3" <mesa-stable@lists.freedesktop.org>
Alon Levy [Tue, 22 Jul 2014 21:07:06 +0000 (00:07 +0300)]
mesa/shaderimage.c: fix inconsistent sign warning
Signed-off-by: Alon Levy <alevy@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Alon Levy [Tue, 22 Jul 2014 21:07:04 +0000 (00:07 +0300)]
wgl: stw_pixelformat_get_info: correct type for index variable
Signed-off-by: Alon Levy <alevy@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Alon Levy [Tue, 22 Jul 2014 21:07:03 +0000 (00:07 +0300)]
u_math.h: fix 64 to 32 bit truncation warning
Signed-off-by: Alon Levy <alevy@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
José Fonseca [Thu, 23 Oct 2014 09:42:12 +0000 (10:42 +0100)]
gallivm: Fix build with LLVM 3.3.
The setMCJITMemoryManager method doesn't exist in LLVM 3.3.
I thought I had tested the latest version of my earlier change with LLVM
3.3, but it looks like I missed it.
Trivial.
José Fonseca [Wed, 22 Oct 2014 19:08:57 +0000 (20:08 +0100)]
gallivm: Properly update for removal of JITMemoryManager in LLVM 3.6.
JITMemoryManager was removed in LLVM 3.6, and replaced by its base class
RTDyldMemoryManager.
This change fixes our JIT memory managers specializations to derive from
RTDyldMemoryManager in LLVM 3.6 instead of JITMemoryManager.
This enables llvmpipe to run with LLVM 3.6.
However, lp_free_generated_code is basically a no-op because there are
not enough hook points in RTDyldMemoryManager to track and free the code
of a module. In other words, with MCJIT, code once created, stays
forever allocated until process destruction. This is not specific to
LLVM 3.6 -- it will happen whenever MCJIT is used regardless of version.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
José Fonseca [Wed, 22 Oct 2014 14:40:53 +0000 (15:40 +0100)]
gallivm: Fix white-space.
Replace tabs with spaces.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
José Fonseca [Wed, 22 Oct 2014 12:09:59 +0000 (13:09 +0100)]
gallivm,llvmpipe,clover: Bump required LLVM version to 3.3.
We'll need to update gallivm for the interface changes in LLVM 3.6, and
the fewer the number of older LLVM versions we support the less hairy that
will be.
As a consequence, the HAVE_AVX define can disappear. (Note HAVE_AVX meant
whether the LLVM version supports AVX or not. Runtime support for AVX is
always checked and enforced independently.)
Verified llvmpipe builds and runs with LLVM 3.3, 3.4, and 3.5.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Ilia Mirkin [Wed, 22 Oct 2014 02:20:50 +0000 (22:20 -0400)]
mesa: remove conditional render and rgtc from ES3 requirements
The functionality exposed by those extensions does not appear in ES3.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Brian Paul [Tue, 21 Oct 2014 16:30:22 +0000 (10:30 -0600)]
u_blitter: put a comment on util_blitter_cache_all_shaders()
Trivial.
Brian Paul [Tue, 21 Oct 2014 16:26:24 +0000 (10:26 -0600)]
u_blitter: use ctx->bind_fs_state(), not pipe->bind_fs_state()
Consistently use the function pointer we saved earlier.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Brian Paul [Tue, 21 Oct 2014 18:14:06 +0000 (12:14 -0600)]
u_blitter: create basic fs shaders in util_blitter_cache_all_shaders()
We need to create all fs shaders in this function.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Brian Paul [Tue, 21 Oct 2014 16:22:35 +0000 (10:22 -0600)]
u_blitter: do error checking assertions for shader caching
If the user calls util_blitter_cache_all_shaders(), set a flag and assert
that we never try to create any new fragment shaders after that point.
If the assertion fails, it means we missed generating some shader in
util_blitter_cache_all_shaders().
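The pattern can be sketched like this, with a hypothetical context struct
and a dummy object standing in for real shader creation. None of these
names are the u_blitter ones:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_FS_KINDS 4

/* Hypothetical, simplified blitter context. */
struct blitter_sketch {
   void *fs[NUM_FS_KINDS];
   bool cached_all_shaders;
};

static int dummy_shader;   /* stands in for a real shader object */

/* Lazily creates a shader. Creating one after the cache-all call means
 * the cache-all function forgot this kind, so assert in that case. */
static void *get_fs(struct blitter_sketch *b, int kind)
{
   if (!b->fs[kind]) {
      assert(!b->cached_all_shaders);
      b->fs[kind] = &dummy_shader;
   }
   return b->fs[kind];
}

/* Creates every shader kind up front, then sets the flag. */
static void cache_all_shaders(struct blitter_sketch *b)
{
   for (int kind = 0; kind < NUM_FS_KINDS; kind++)
      get_fs(b, kind);
   b->cached_all_shaders = true;
}
```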
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Anuj Phogat [Mon, 22 Sep 2014 22:10:28 +0000 (15:10 -0700)]
glsl: Use signed array index in update_max_array_access()
Avoids a crash in case a negative array index is used in a
shader program.
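A small sketch of why the signedness matters, using a hypothetical
bookkeeping struct rather than the real GLSL IR types. With an unsigned
parameter, an index of -2 would arrive as a huge positive number. With a
signed one, the negative value stays visible and can be rejected before it
is used:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-variable bookkeeping. */
struct array_info {
   int length;             /* declared array size */
   int max_array_access;   /* highest constant index seen, -1 if none */
};

/* Records a constant index. Returns false for an out-of-bounds index so
 * the caller can report an error instead of touching the bookkeeping. */
static bool record_array_access(struct array_info *a, int idx)
{
   if (idx < 0 || idx >= a->length)
      return false;
   if (idx > a->max_array_access)
      a->max_array_access = idx;
   return true;
}
```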
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Anuj Phogat [Thu, 18 Sep 2014 23:30:31 +0000 (16:30 -0700)]
glsl: Fix crash due to negative array index
Currently Mesa crashes with a shader like this:
[fragment shader]
float[5] array;
int idx = -2;
void main()
{
   gl_FragColor = vec4(0.0, 1.0, 0.0, array[idx]);
}
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Marek Olšák [Wed, 22 Oct 2014 08:59:49 +0000 (10:59 +0200)]
radeonsi: implement pipe_rasterizer_state::clip_halfz
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Wed, 22 Oct 2014 08:59:49 +0000 (10:59 +0200)]
r600g: implement pipe_rasterizer_state::clip_halfz
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Wed, 22 Oct 2014 08:59:49 +0000 (10:59 +0200)]
r300g: implement pipe_rasterizer_state::clip_halfz
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Michel Dänzer [Tue, 21 Oct 2014 03:40:15 +0000 (12:40 +0900)]
r600g: Drop references to destroyed blend state
Fixes use-after-free when the currently bound blend state is destroyed.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85267
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84140
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Cc: mesa-stable@lists.freedesktop.org
Kenneth Graunke [Thu, 16 Oct 2014 01:57:07 +0000 (18:57 -0700)]
i965/vec4: Generate better code for ir_triop_csel.
Previously, we generated an extra CMP instruction:
cmp.ge.f0(8) g6<1>D g1<0,4,1>F 0F
cmp.nz.f0(8) null g6<4,4,1>D 0D
(+f0) sel(8) g5<1>F g1.4<0,4,1>F g2<0,4,1>F
The first operand is always a boolean, and we want to predicate the SEL
on that. Rather than producing a boolean value and comparing it against
zero, we can just produce a condition code in the flag register.
Now we generate:
cmp.ge.f0(8) null g1<0,4,1>F 0F
(+f0) sel(8) g5<1>F g1.4<0,4,1>F g2<0,4,1>F
No difference in shader-db.
v2: Remember to delete the old code (thanks Matt).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Thu, 16 Oct 2014 02:17:21 +0000 (19:17 -0700)]
i965/vec4: Simplify visit(ir_expression *)'s result_src/dst setup.
Using dst_reg(this, ir->type) automatically sets the writemask to the
proper size for the type; src_reg(dst_reg) preserves that. This should
be equivalent, but less code.
Note that src_reg(dst_reg) either uses SWIZZLE_XXXX or SWIZZLE_XYZW, so
the old code did need the manual writemask adjustment, since it
constructed the registers the other way around.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Thu, 16 Oct 2014 02:13:16 +0000 (19:13 -0700)]
i965/vec4: Delete some dead code in visit(ir_expression *).
Nothing uses the vector_elements temporary variable.
Setting this->result.file is dead because we overwrite this->result a
few lines later.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Thu, 16 Oct 2014 01:57:07 +0000 (18:57 -0700)]
i965/fs: Generate better code for ir_triop_csel.
Previously, we generated an extra CMP instruction:
cmp.ge.f0(8) g4<1>D g2<0,1,0>F 0F
cmp.nz.f0(8) null g4<8,8,1>D 0D
(+f0) sel(8) g120<1>F g2.4<0,1,0>F g3<0,1,0>F
The first operand is always a boolean, and we want to predicate the SEL
on that. Rather than producing a boolean value and comparing it against
zero, we can just produce a condition code in the flag register.
Now we generate:
cmp.ge.f0(8) null g2<0,1,0>F 0F
(+f0) sel(8) g124<1>F g2.4<0,1,0>F g3<0,1,0>F
total instructions in shared programs: 5473459 -> 5473253 (-0.00%)
instructions in affected programs: 6219 -> 6013 (-3.31%)
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Thu, 16 Oct 2014 16:28:42 +0000 (09:28 -0700)]
glsl: Delete unused gl_uniform_driver_format enum values.
A while back, Matt made the uniform upload functions simply upload
ctx->Const.UniformBooleanTrue for boolean values instead of 0/1, which
removed the need to convert it later. We also set UniformBooleanTrue to
1.0f for drivers which want to treat booleans as 0.0/1.0f.
Nothing ever sets these, so they are dead.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Rob Clark [Tue, 21 Oct 2014 21:08:10 +0000 (17:08 -0400)]
freedreno/a3xx: fix depth/stencil restore format
Also fix z16 restore format which was completely wrong.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Tue, 21 Oct 2014 16:25:28 +0000 (12:25 -0400)]
freedreno/a3xx: fix viewport state during clear
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Tue, 21 Oct 2014 15:28:53 +0000 (11:28 -0400)]
freedreno: mark scissor state dirty when enable bit changes
We don't have a scissor enable bit in hw, so when a raster state change
results in scissor enable bit changing, we need to also mark scissor
state as dirty.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Tue, 21 Oct 2014 14:30:49 +0000 (10:30 -0400)]
freedreno: clear vs scissor
The optimization of avoiding restore (mem2gmem) if there was a clear
falls down a bit if you don't have a fullscreen scissor. We need to
make the decision logic a bit more clever to keep track of *what* was
cleared, so that we can (a) completely skip mem2gmem if entire buffer
was cleared, or (b) skip mem2gmem on a per-tile basis for tiles that
were completely cleared.
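The decision can be sketched as a rectangle containment test. The types
and names are hypothetical, and the sketch assumes the cleared area is
tracked as a single rectangle:

```c
#include <stdbool.h>

/* Hypothetical rectangle covering [x1, x2) x [y1, y2). */
struct rect {
   int x1, y1, x2, y2;
};

static bool rect_contains(const struct rect *outer, const struct rect *inner)
{
   return outer->x1 <= inner->x1 && outer->y1 <= inner->y1 &&
          outer->x2 >= inner->x2 && outer->y2 >= inner->y2;
}

/* A tile can skip the restore (mem2gmem) only if the clear covered all of
 * it. Passing the whole buffer as `tile` answers case (a). */
static bool tile_needs_restore(const struct rect *cleared,
                               const struct rect *tile)
{
   return !rect_contains(cleared, tile);
}
```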
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Vinson Lee [Sun, 19 Oct 2014 07:13:33 +0000 (00:13 -0700)]
clover: Fix build error with LLVM 3.4.
DataLayoutPass was added in LLVM 3.5 r202168, commit
57edc9d4ff1648568a5dd7e9958649065b260dca "Make DataLayout a plain
object, not a pass.".
This patch fixes this build error with LLVM 3.4.
CXX llvm/libclllvm_la-invocation.lo
llvm/invocation.cpp: In function 'void {anonymous}::optimize(llvm::Module*, unsigned int, const std::vector<llvm::Function*>&)':
llvm/invocation.cpp:324:18: error: expected type-specifier
PM.add(new llvm::DataLayoutPass(mod));
^
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85189
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Marek Olšák [Tue, 23 Sep 2014 15:17:01 +0000 (17:17 +0200)]
r600g,radeonsi: convert TGSI shader type to LLVM shader type
The values are hardcoded in the LLVM backend, but the TGSI definitions are
going to be changed with tessellation, e.g. TGSI_PROCESSOR_COMPUTE will be
increased by 2.
We'll use VS for LS and HS, because there's nothing special about them
from the LLVM backend point of view, even though the hardware side is
different. We do the same for ES.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sun, 5 Oct 2014 22:19:31 +0000 (00:19 +0200)]
radeonsi: add some missing register definitions
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sun, 5 Oct 2014 11:33:40 +0000 (13:33 +0200)]
radeonsi: load ring resource descriptors only once
v2: document the new functions
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Fri, 26 Sep 2014 21:06:32 +0000 (23:06 +0200)]
radeonsi: clarify shader constant load functions
I'll need indexed loads without the meta data flag for tessellation later.
Also rename load_const to buffer_load_const to distinguish it from indexed
const loads.
v2: add comments
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Sun, 5 Oct 2014 10:38:54 +0000 (12:38 +0200)]
radeonsi: statically declare resource and sampler arrays
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Thu, 16 Oct 2014 14:20:26 +0000 (16:20 +0200)]
radeonsi: remove conversion of DX9 FACE input to GL
st/mesa and gallium expect the DX9 format, so this is useless.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Tue, 14 Oct 2014 20:51:10 +0000 (22:51 +0200)]
radeonsi: revert hack for random failures in glsl-max-varyings
This reverts commit 032e5548b3d4b5efa52359218725cb8e31b622ad.
I've run glsl-max-varyings 30 times and it always passed.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Tue, 14 Oct 2014 15:48:52 +0000 (17:48 +0200)]
radeonsi: generate shader pm4 states right after shader compilation
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Tue, 14 Oct 2014 15:36:30 +0000 (17:36 +0200)]
radeonsi: make pm4 state generation for shaders independent of the context
The si_pm4_delete_state calls became useless, because the pm4 state is
always generated only once.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Tue, 14 Oct 2014 15:31:00 +0000 (17:31 +0200)]
radeonsi: inline si_pm4_alloc_state
It seemed like the function needed a context pointer. Let's remove it
to make it less confusing.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Mon, 20 Oct 2014 13:41:42 +0000 (15:41 +0200)]
r300g: replace r300_get_num_samples with a util variant
Marek Olšák [Mon, 6 Oct 2014 19:12:14 +0000 (21:12 +0200)]
glsl_to_tgsi: use _mesa_copy_linked_program_data
This deduplicates some code.
Marek Olšák [Thu, 16 Oct 2014 14:21:54 +0000 (16:21 +0200)]
glsl_to_tgsi: fix the value of gl_FrontFacing with native integers
We must convert it to boolean from the DX9 float encoding that Gallium
specifies.
Later, we should probably define that FACE should be 0 or ~0 if native
integers are supported.
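As an illustration of the conversion, here is a sketch that assumes the
float FACE value is positive for front-facing primitives and negative for
back-facing ones, and that a native-integer boolean is 0 or ~0. The
function name is made up for this example:

```c
#include <stdint.h>

/* Assumption: face > 0 means front-facing. Returns an integer boolean,
 * all ones for true and zero for false. */
static uint32_t front_facing_from_face(float face)
{
   return face > 0.0f ? ~UINT32_C(0) : UINT32_C(0);
}
```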
Cc: 10.2 10.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Marek Olšák [Sun, 5 Oct 2014 16:55:47 +0000 (18:55 +0200)]
st/mesa: add ST_DEBUG=wf option which enables wireframe rendering
Useful for tessellation.
Marek Olšák [Wed, 1 Oct 2014 18:28:17 +0000 (20:28 +0200)]
gallium: add PIPE_SHADER_CAP_MAX_OUTPUTS and use it in st/mesa
With 5 shader stages and various combinations of enabled and disabled shaders,
the maximum number of outputs in one shader doesn't have to be equal to
the maximum number of inputs in the following shader.
v2: return 32 for softpipe and llvmpipe
Eric Anholt [Tue, 21 Oct 2014 14:46:48 +0000 (15:46 +0100)]
vc4: Fix SRC_ALPHA_SATURATE blending.
Fixes glean blendFunc.
Eric Anholt [Mon, 20 Oct 2014 20:14:57 +0000 (21:14 +0100)]
vc4: Fix stencil writemask handling.
If the writemask doesn't compress, then we want to put in the uncompressed
writemask, not the compressed writemask failure value (all-on).
Fixes glean's stencil2 and fbo-clear-formats on stencil.
Eric Anholt [Mon, 20 Oct 2014 21:53:07 +0000 (22:53 +0100)]
vc4: Don't look at back stencil state unless two-sided stencil is enabled.
Fixes regressions in the next bugfix, because gallium util stuff leaves
the back stencil state as 0 if !back->enabled.
Rob Clark [Sun, 19 Oct 2014 18:55:32 +0000 (14:55 -0400)]
freedreno/ir3: add debug flag to disable cp
FD_MESA_DEBUG=nocp will disable copy propagation pass.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Ilia Mirkin [Fri, 3 Oct 2014 20:23:19 +0000 (16:23 -0400)]
freedreno: positions come out as integers, not half-integers
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Sat, 18 Oct 2014 20:52:44 +0000 (16:52 -0400)]
freedreno/a3xx: disable early-z when we have kill's
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Sat, 18 Oct 2014 19:28:16 +0000 (15:28 -0400)]
freedreno/ir3: fix potential gpu lockup with kill
It seems like the hardware is unhappy if we execute a kill instruction
prior to last input (ei). Probably the shader thread stops executing
and the end-input flag is never set.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Sat, 18 Oct 2014 18:46:35 +0000 (14:46 -0400)]
freedreno/ir3: comment + better fxn name
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Fri, 17 Oct 2014 12:57:16 +0000 (08:57 -0400)]
freedreno/a3xx: only emit dirty consts
If app only updates (for example) vertex uniforms, it would be nice to
only re-emit those and not also frag uniforms. Means we need to mark
the first frag shader const buffer dirty after a clear.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Wed, 15 Oct 2014 21:15:06 +0000 (17:15 -0400)]
freedreno/a3xx: more layer/level fixes
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Brian Paul [Mon, 20 Oct 2014 17:53:33 +0000 (11:53 -0600)]
mesa: fix 'feeedback' typo in comment
Trivial.
Brian Paul [Mon, 20 Oct 2014 17:49:17 +0000 (11:49 -0600)]
mesa: fix 'misalgned' typos in error messages
Trivial.
Brian Paul [Fri, 17 Oct 2014 19:31:53 +0000 (13:31 -0600)]
glsl: fix several use-after-free bugs
The get_variable_being_redeclared() function can free the 'var' argument.
Thereafter, we cannot assume that 'var' is a valid pointer. This patch
replaces 'var->name' with 'earlier->name' in two places and calls
is_gl_identifier(var->name) before 'var' might get freed.
This fixes several piglit GLSL crashes, including:
spec/glsl-1.50/execution/geometry/clip-distance-in-param
spec/glsl-1.50/execution/geometry/clip-distance-bulk-copy
spec/glsl-1.50/compiler/gs-redeclares-pervertex-out-before-global-redeclaration.geom
I'm not sure why these were not spotted sooner.
A similar bug was previously fixed by f9cecca7a.
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Tapani Pälli [Tue, 14 Oct 2014 09:39:54 +0000 (12:39 +0300)]
mesa: validate sampler uniforms during glUniform calls
Patch fixes 'glsl-2types-of-textures-on-same-unit' in WebGL conformance
test suite. No Piglit regressions, fixes gl-2.0-active-sampler-conflict.
To avoid adding potentially heavy check during draw (valid_to_render),
check is done during uniform updates by inspecting TexturesUsed mask.
A new boolean variable is introduced to cache validation state.
v2: take into account case where 2 uniforms use same unit (curro)
also do the check only when SSO is not in use, SSO has own
path for sampler validation.
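A sketch of the kind of mask check described, assuming one bit per texture
target type for each unit. The array layout and names are illustrative
only. Two uniforms of the same type on one unit set the same bit and stay
valid, while two different types leave more than one bit set:

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_UNITS 8

/* textures_used[unit] has one bit per texture target type (2D, cube, ...)
 * that some sampler uniform reads through that unit. */
static bool samplers_valid(const uint32_t textures_used[MAX_UNITS])
{
   for (int unit = 0; unit < MAX_UNITS; unit++) {
      uint32_t mask = textures_used[unit];

      /* More than one bit set: two sampler types share this unit. */
      if (mask & (mask - 1))
         return false;
   }
   return true;
}
```

The result would be recomputed when a sampler uniform changes and cached,
so the draw path only has to read a boolean.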
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
EdB [Sat, 11 Oct 2014 16:01:36 +0000 (18:01 +0200)]
clover: Don't return CL_INVALID_VALUE if there is no header.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
EdB [Sat, 11 Oct 2014 22:58:39 +0000 (01:58 +0300)]
clover: Add allow_empty_tag.
To allow empty objs() list checks.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
EdB [Mon, 20 Oct 2014 07:34:17 +0000 (10:34 +0300)]
clover: Add initial implementation of clCompileProgram for CL 1.2.
[ Francisco Jerez: General clean-up. ]
Reviewed-by: Francisco Jerez <currojerez@riseup.net>