Anuj Phogat [Thu, 7 Mar 2013 22:05:38 +0000 (14:05 -0800)]
mesa: Fix FB blitting in case of zero size src or dst rect
Framebuffer blitting operation should be skipped if any of the
dimensions (width/height) of src/dst rect is zero.
V2: Move the dimension check after error checking in _mesa_BlitFramebuffer.
Fixes: fbblit(negative.nullblit.zeroSize) in Intel oglconform
https://bugs.freedesktop.org/show_bug.cgi?id=59495
Note: Candidate for all the stable branches.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
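The zero-size check described in the blit fix above can be sketched as follows. This is a minimal illustration, not the code from _mesa_BlitFramebuffer: the helper name blit_is_noop and its parameter layout are hypothetical, and the sketch assumes the check runs only after all error checking has passed, as the V2 note says.

```c
#include <stdbool.h>

/* Hypothetical helper (not Mesa's actual code): returns true when either
 * the source or the destination rectangle has zero width or zero height,
 * in which case the blit touches no pixels and can be skipped. */
static bool
blit_is_noop(int srcX0, int srcY0, int srcX1, int srcY1,
             int dstX0, int dstY0, int dstX1, int dstY1)
{
   return srcX0 == srcX1 || srcY0 == srcY1 ||
          dstX0 == dstX1 || dstY0 == dstY1;
}
```

A caller would return early when blit_is_noop() is true. Flipped rectangles (x1 < x0) still have a non-zero extent and are not treated as empty here.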
Roland Scheidegger [Wed, 13 Mar 2013 21:10:18 +0000 (22:10 +0100)]
tgsi: fix sample_d emit for arrays
Those cases were apparently forgotten.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Wed, 13 Mar 2013 20:23:18 +0000 (21:23 +0100)]
llvmpipe: don't assert when trying to render to surfaces with multiple layers
instead just warn when creating the surface, rendering will simply happen
to first layer.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Wed, 13 Mar 2013 20:19:20 +0000 (21:19 +0100)]
softpipe: don't assert when creating surfaces with multiple layers
We can't handle them yet, however we can safely just warn (we will
just render to first layer, which is fine since we can't handle
rendertarget system value either).
Also make behavior more predictable with buffer surfaces
(it would sometimes hit bogus asserts because of the union in the surface,
instead create the surface but assert when trying to set a buffer
in the framebuffer).
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
José Fonseca [Wed, 13 Mar 2013 21:21:17 +0000 (21:21 +0000)]
llvmpipe: Fix geometry shader token leak.
Trivial. Matches softpipe's code.
Tom Stellard [Thu, 7 Mar 2013 21:51:14 +0000 (16:51 -0500)]
radeon/llvm: Add missing license headers
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>
Tom Stellard [Thu, 7 Mar 2013 21:51:13 +0000 (16:51 -0500)]
radeon/llvm: Make radeon_llvm_util.cpp a C file
All the functions in this file are now implemented in C.
Tom Stellard [Thu, 7 Mar 2013 21:51:12 +0000 (16:51 -0500)]
radeon/llvm: Optimize radeon_llvm_strip_unused_kernels()
Just delete unused kernels rather than marking them as internal and
running the GlobalDCE pass.
Also implement this function in C and inline it into
radeon_llvm_get_kernel_module()
Tom Stellard [Thu, 7 Mar 2013 21:51:11 +0000 (16:51 -0500)]
radeon/llvm: Implement radeon_llvm_get_kernel_module() using the C API
Tom Stellard [Thu, 7 Mar 2013 21:51:10 +0000 (16:51 -0500)]
radeon/llvm: Implement radeon_llvm_get_num_kernels() using the C API
Tom Stellard [Thu, 7 Mar 2013 21:51:09 +0000 (16:51 -0500)]
radeon/llvm: Implement radeon_llvm_parse_bitcode() using C API
Also make the function static since it is not used anywhere else.
Tom Stellard [Thu, 7 Mar 2013 21:51:08 +0000 (16:51 -0500)]
r600g/llvm: Move llvm wrapper functions into the radeon directory
Jon TURNEY [Wed, 27 Feb 2013 15:32:37 +0000 (15:32 +0000)]
Properly check GLX_INDIRECT_RENDERING in glapi/tests/check_table
Actually use $DEFINES, so we can see if GLX_INDIRECT_RENDERING is defined.
If GLX_INDIRECT_RENDERING is defined, _GLAPI_SKIP_PROTO_ENTRY_POINTS will
be defined, and libglapi won't contain the 'protocol entry points', so we
should provide stubs in check_table.cpp
Jon TURNEY [Wed, 27 Feb 2013 12:58:17 +0000 (12:58 +0000)]
Fix glapi/tests/check_table.cpp for standardized OpenGL function names
It looks like this has been broken since commit
1a1db1746db82efc7f0643508886dfc78a15eb71 "Standardize names of OpenGL
functions."
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Jon TURNEY [Tue, 26 Feb 2013 16:02:13 +0000 (16:02 +0000)]
Fix out-of-tree build of 'make check' in src/mapi/glapi/tests/
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
José Fonseca [Wed, 13 Mar 2013 13:13:08 +0000 (13:13 +0000)]
scons: Define PACKAGE_VERSION/BUGREPORT globally.
Fixes the scons build.
Vinson Lee [Wed, 13 Mar 2013 05:32:47 +0000 (22:32 -0700)]
tests: Add $(top_srcdir)/include to AM_CPPFLAGS.
Fixes this build error with make check.
CC collision.o
In file included from ../../../../../src/mesa/main/hash_table.h:34:0,
from collision.c:31:
../../../../../src/mesa/main/compiler.h:51:53: fatal error: c99_compat.h: No such file or directory
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
José Fonseca [Wed, 13 Mar 2013 01:25:30 +0000 (01:25 +0000)]
scons: Define PACKAGE_xxx
Should get the builds going again.
Brian Paul [Tue, 12 Mar 2013 00:31:22 +0000 (18:31 -0600)]
docs: rewrite the OSMesa info / instructions
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Tue, 12 Mar 2013 00:31:21 +0000 (18:31 -0600)]
configure: wire-up new OSMesa gallium state tracker and target
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Tue, 12 Mar 2013 00:31:21 +0000 (18:31 -0600)]
target/osmesa: add new Makefile.am
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Tue, 12 Mar 2013 00:31:21 +0000 (18:31 -0600)]
targets/osmesa: new OSMesa gallium target
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Tue, 12 Mar 2013 00:31:21 +0000 (18:31 -0600)]
st/osmesa: add new Makefile.am
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Tue, 12 Mar 2013 00:31:21 +0000 (18:31 -0600)]
st/osmesa: new OSMesa gallium state tracker
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Tue, 12 Mar 2013 00:31:21 +0000 (18:31 -0600)]
st/mesa: add PIPE_FORMAT_R16G16B16A16_UNORM renderbuffer support
To allow rendering in 16-bit/channel RGBA buffers.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
José Fonseca [Wed, 13 Mar 2013 00:31:03 +0000 (00:31 +0000)]
scons: Re-add ','
José Fonseca [Wed, 13 Mar 2013 00:16:24 +0000 (00:16 +0000)]
autotools: Add missing top-level include dir.
Fixes autotools build failure. Not sure if there are more, as I have
difficulties in building the full tree.
Matt Turner [Wed, 13 Mar 2013 00:09:55 +0000 (17:09 -0700)]
configure.ac: Alphabetize freedreno makefiles.
Matt Turner [Fri, 22 Feb 2013 00:51:19 +0000 (16:51 -0800)]
build: Get rid of dead MESA_ASM_FILES variable
Reviewed-by: Eric Anholt <eric@anholt.net>
Matt Turner [Fri, 22 Feb 2013 00:51:03 +0000 (16:51 -0800)]
mesa/build: Get rid of dead ALL_FILES variable
Reviewed-by: Eric Anholt <eric@anholt.net>
Matt Turner [Fri, 22 Feb 2013 01:03:18 +0000 (17:03 -0800)]
xmlpool/.gitignore: Remove 'Makefile'
Handled by top level .gitignore.
Reviewed-by: Eric Anholt <eric@anholt.net>
Matt Turner [Sat, 9 Mar 2013 08:28:09 +0000 (00:28 -0800)]
mesa: Use PACKAGE_BUGREPORT macro.
Reviewed-by: Eric Anholt <eric@anholt.net>
Matt Turner [Sat, 9 Mar 2013 08:23:20 +0000 (00:23 -0800)]
mesa: Remove unused version #defines from version.h.
Reviewed-by: Eric Anholt <eric@anholt.net>
Matt Turner [Sat, 9 Mar 2013 08:25:45 +0000 (00:25 -0800)]
mesa: Replace MESA_VERSION with PACKAGE_VERSION.
One fewer place to have to update.
Reviewed-by: Eric Anholt <eric@anholt.net>
Zack Rusin [Tue, 12 Mar 2013 20:41:35 +0000 (13:41 -0700)]
draw/so: Fix stream output with geometry shaders
If a geometry shader is present, its stream output info should
be used instead of the vs and we shouldn't use the pre-clipped
coordinates.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
José Fonseca [Tue, 12 Mar 2013 20:37:47 +0000 (20:37 +0000)]
include: Fix build with VS 11 (i.e, 2012).
NOTE: Candidate for the stable branches.
Reviewed-by: Brian Paul <brianp@vmware.com>
José Fonseca [Tue, 12 Mar 2013 11:17:49 +0000 (11:17 +0000)]
mesa,gallium,egl,mapi: One definition of C99 inline/__func__ to rule them all.
We were in four already...
NOTE: Candidate for the stable branches.
Reviewed-by: Brian Paul <brianp@vmware.com>
José Fonseca [Tue, 12 Mar 2013 20:33:38 +0000 (20:33 +0000)]
scons: Allows choosing VS 10 or 11.
NOTE: Candidate for the stable branches.
Reviewed-by: Brian Paul <brianp@vmware.com>
Michel Dänzer [Tue, 12 Mar 2013 11:34:37 +0000 (12:34 +0100)]
radeonsi: Fix off-by-one for maximum vertex element index in some cases
In cases where the vertex element size is smaller than the vertex buffer
stride, the previous calculation could end up 1 too low. This would result
in the GPU using index 0 instead of the maximum index for those elements,
which would be visible as intermittent distorted triangles.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
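The off-by-one described in the radeonsi fix above can be illustrated with a small sketch. It is not the driver's code: both function names are hypothetical, the "naive" form is only an assumption about what a calculation that ignores the element size would look like, and stride is assumed to be non-zero.

```c
#include <stdint.h>

/* Hypothetical "naive" count: treats every vertex as occupying a full
 * stride, so a final element that is smaller than the stride is missed. */
static uint32_t
num_records_naive(uint32_t size, uint32_t offset, uint32_t stride)
{
   return (size - offset) / stride;
}

/* Hypothetical corrected count: the last element only needs elem_size
 * bytes, not a whole stride, to fit inside the buffer. */
static uint32_t
num_records_exact(uint32_t size, uint32_t offset, uint32_t stride,
                  uint32_t elem_size)
{
   if (size < offset + elem_size)
      return 0; /* not even one element fits */
   return (size - offset - elem_size) / stride + 1;
}
```

With a 40-byte buffer, a 16-byte stride and an 8-byte element, elements start at bytes 0, 16 and 32, so three fit; the naive form yields two, which makes the maximum index one too low.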
Christoph Bumiller [Mon, 11 Mar 2013 19:53:25 +0000 (20:53 +0100)]
nvc0: avoid crash on updating RASTERIZE_ENABLE state
When doing a blit with the 3D engine, the rasterizer or zsa cso may
be NULL.
Christoph Bumiller [Fri, 1 Mar 2013 15:45:47 +0000 (16:45 +0100)]
gallium/tests: check format in compute tests, make selectable
Christoph Bumiller [Sat, 9 Mar 2013 16:17:14 +0000 (17:17 +0100)]
nvc0: add MP trap handler for nve4
Christoph Bumiller [Sat, 9 Mar 2013 11:11:38 +0000 (12:11 +0100)]
nvc0: they removed the NTID,NCTAID,GRIDID registers on nve4
Christoph Bumiller [Sat, 23 Feb 2013 18:40:23 +0000 (19:40 +0100)]
nvc0: implement compute support for nve4
Christoph Bumiller [Mon, 11 Mar 2013 16:34:43 +0000 (17:34 +0100)]
nvc0/ir: try to fix CAS (CompareAndSwap)
Christoph Bumiller [Mon, 11 Mar 2013 16:34:05 +0000 (17:34 +0100)]
nv50/ir: add CCTL (cache control) op
Christoph Bumiller [Mon, 11 Mar 2013 16:32:52 +0000 (17:32 +0100)]
nvc0/ir/emit: fix emission of large address offsets
Christoph Bumiller [Fri, 8 Mar 2013 21:40:30 +0000 (22:40 +0100)]
nvc0: add SHADER/COMPUTE_RESOURCE bind flags to format table
Christoph Bumiller [Sat, 2 Mar 2013 17:27:56 +0000 (18:27 +0100)]
nouveau: align PIPE_BIND_SHADER,COMPUTE_RESOURCEs to 256 bytes
Christoph Bumiller [Fri, 1 Mar 2013 20:37:37 +0000 (21:37 +0100)]
nv50,nvc0: copy writable flag on surface creation
Christoph Bumiller [Sat, 2 Mar 2013 20:00:26 +0000 (21:00 +0100)]
nv50/ir: add support for different sampler and resource index on nve4
And remove non-working code for indirect sampler/resource selection.
Will be added back later.
Includes code from "nv50/ir/tgsi: Resource indirect indexing" by
Francisco Jerez (when mixing the R and S handles we can only specify
them via a register, i.e. indirectly, unless we upload all the used
handle combinations to c[] space, which we don't for now).
Christoph Bumiller [Sat, 2 Mar 2013 13:59:06 +0000 (14:59 +0100)]
nv50/ir: implement splitting of 64 bit ops after RA
Christoph Bumiller [Thu, 28 Feb 2013 21:08:36 +0000 (22:08 +0100)]
nvc0/ir: skip back edges when determining latest sched value
Christoph Bumiller [Thu, 28 Feb 2013 18:07:24 +0000 (19:07 +0100)]
nvc0/ir: use large issue delay after RET, too
Christoph Bumiller [Thu, 28 Feb 2013 18:00:02 +0000 (19:00 +0100)]
nv50/ir: fix size adjustment for sched info for multiple functions
Christoph Bumiller [Wed, 27 Feb 2013 20:02:29 +0000 (21:02 +0100)]
nv50/ir: print function inputs and outputs
Christoph Bumiller [Wed, 27 Feb 2013 14:32:35 +0000 (15:32 +0100)]
nv50/ir/ssa: add a few comments regarding RenamePass
Francisco Jerez [Mon, 25 Feb 2013 20:57:32 +0000 (21:57 +0100)]
nv50/ir/tgsi: Exclude local declarations from function prototypes.
Christoph Bumiller [Mon, 25 Feb 2013 14:52:10 +0000 (15:52 +0100)]
nv50/ir/opt: try to make use of SUCLAMP addend
Christoph Bumiller [Sun, 24 Feb 2013 17:36:44 +0000 (18:36 +0100)]
nv50/ir: don't assert on type in Modifier.applyTo if it is 0
Christoph Bumiller [Sat, 23 Feb 2013 12:09:32 +0000 (13:09 +0100)]
nv50/ir: add support for barriers
nv50 part by Francisco Jerez.
Christoph Bumiller [Wed, 20 Feb 2013 20:33:38 +0000 (21:33 +0100)]
nv50/ir/tgsi: add support for atomics
Christoph Bumiller [Fri, 22 Feb 2013 23:39:23 +0000 (00:39 +0100)]
nv50/ir/tgsi: handle TGSI_OPCODE_LOAD,STORE
Squashed and (heavily) modified original patches by Francisco Jerez:
nv50/ir/tgsi: Implement resource LOAD/STORE (wip).
nv50/ir/tgsi: Emit SUST/SULD for surface access, and add CB LOAD/STORE support
nv50/ir/tgsi: Fix/clean up the LOAD/STORE handling code.
Left out for now:
nv50/ir/tgsi: Resource indirect indexing
Treating raw, read-only surfaces as constant buffers (CBs) was removed
because CBs are limited to a size of 64 KiB which isn't desirable, and
because this decision should probably be made by the state tracker.
If we used a number of CB slots for surfaces, it might find that we
cannot accommodate the advertised limit.
Christoph Bumiller [Thu, 28 Feb 2013 20:05:45 +0000 (21:05 +0100)]
nvc0/ir: don't replace load from input in COMPUTE progs with VFETCH
Christoph Bumiller [Fri, 22 Feb 2013 23:00:27 +0000 (00:00 +0100)]
nvc0/ir: implement lowering of surface ops for nve4
Christoph Bumiller [Tue, 19 Feb 2013 21:12:01 +0000 (22:12 +0100)]
nvc0/ir: add formatted surface load lib code, move to extra header
OpenGL is nice and makes the user specify a format with an image unit.
OpenCL is evil and doesn't, and what's better than adding a huge load
of functions that we call indirectly to handle the conversion?
Christoph Bumiller [Sun, 17 Feb 2013 11:01:55 +0000 (12:01 +0100)]
nv50/ir: extend moveSources for delta < 0
Christoph Bumiller [Fri, 22 Feb 2013 19:46:28 +0000 (20:46 +0100)]
nvc0/ir: lower atomics in s[]
Christoph Bumiller [Fri, 22 Feb 2013 19:35:32 +0000 (20:35 +0100)]
nvc0/ir/emit: implement INSBF, EXTBF, PERMT and ATOM
Christoph Bumiller [Wed, 20 Feb 2013 19:54:14 +0000 (20:54 +0100)]
nv50/ir/emit: handle OP_ATOM
Christoph Bumiller [Fri, 8 Mar 2013 18:08:23 +0000 (19:08 +0100)]
nvc0/ir/target: some ops can't be predicated, e.g. CALL
Christoph Bumiller [Tue, 26 Feb 2013 20:05:03 +0000 (21:05 +0100)]
nv50/ir/opt: CALLs cannot load
Christoph Bumiller [Fri, 22 Feb 2013 19:08:57 +0000 (20:08 +0100)]
nv50/ir: add support for indirect BRA,CALL
Christoph Bumiller [Fri, 22 Feb 2013 18:10:20 +0000 (19:10 +0100)]
nvc0/ir/emit: implement move to and logic ops on predicates
Christoph Bumiller [Fri, 22 Feb 2013 18:05:16 +0000 (19:05 +0100)]
nvc0/ir/emit: implement surface related ops
Christoph Bumiller [Mon, 25 Feb 2013 11:52:43 +0000 (12:52 +0100)]
nv50/ir: initialize CodeEmitters' specialized target fields
Christoph Bumiller [Wed, 20 Feb 2013 20:03:30 +0000 (21:03 +0100)]
nv50/ir/opt: make optimization aware of atomics, barriers, surface ops
Christoph Bumiller [Fri, 22 Feb 2013 17:45:16 +0000 (18:45 +0100)]
nv50/ir: add various new OPs that will be needed for compute
Francisco Jerez [Fri, 18 May 2012 14:17:44 +0000 (16:17 +0200)]
nv50/ir: Rename "mkLoad" to "mkLoadv" for consistency.
Christoph Bumiller [Sun, 24 Feb 2013 17:36:21 +0000 (18:36 +0100)]
nv50/ir: fix comparison of system values
Francisco Jerez [Tue, 6 Mar 2012 19:18:12 +0000 (20:18 +0100)]
nv50/ir/tgsi: Translate grid-related system parameters.
Francisco Jerez [Mon, 14 Nov 2011 23:12:20 +0000 (00:12 +0100)]
nv50/ir/tgsi: Accept COMPUTE programs.
Christoph Bumiller [Wed, 27 Feb 2013 20:08:57 +0000 (21:08 +0100)]
nv50/ir/ra: make sure all used function inputs get assigned a reg
A live range [0, 0) counts as empty. For function inputs this can
be a problem, so insert a nop at the beginning to make it [0, 1).
This is a bit of a hack but also the most simple solution.
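The empty-range problem in the commit above can be sketched with a half-open interval. This is only one reading of the message, not the nv50/ir register allocator: the struct, the function names and the idea that a function input is live from position 0 up to the serial number of its last use are all assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical half-open live range [begin, end). */
struct live_range {
   int begin;
   int end;
};

/* A half-open interval contains no positions when begin >= end. */
static bool
range_is_empty(struct live_range r)
{
   return r.begin >= r.end;
}

/* Assumed model: a function input is live from position 0 up to the
 * serial of its last use. Inserting a nop at the start of the function
 * shifts every real instruction, and thus the last use, by one. */
static struct live_range
input_range(int last_use_serial, bool nop_inserted)
{
   struct live_range r = { 0, last_use_serial + (nop_inserted ? 1 : 0) };
   return r;
}
```

Under this model an input used only by the very first instruction has the range [0, 0) and would be skipped as empty; with the nop in place it becomes [0, 1) and gets a register.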
Christoph Bumiller [Mon, 25 Feb 2013 13:45:52 +0000 (14:45 +0100)]
nv50/ir/ra: also add pre-existing MERGE,SPLIT to constraint list
Christoph Bumiller [Wed, 6 Feb 2013 16:14:55 +0000 (17:14 +0100)]
nv50/ir/ra: fix confusion with conditional RegisterSet::occupy
Christoph Bumiller [Thu, 28 Feb 2013 22:41:41 +0000 (23:41 +0100)]
nv50/ir/ra: swap copyCompound args if src is compound and dst isn't
Francisco Jerez [Mon, 30 Apr 2012 13:22:27 +0000 (15:22 +0200)]
nv50/ir/ra: Fix maxGPR calculation for programs with multiple functions.
Francisco Jerez [Mon, 30 Apr 2012 13:19:40 +0000 (15:19 +0200)]
nv50/ir/ra: Fix traversal before the beginning of the active list in buildRIG.
Francisco Jerez [Mon, 30 Apr 2012 13:13:07 +0000 (15:13 +0200)]
nv50/ir/ra: Fix RegisterSet::occupy(const Value *v).
Francisco Jerez [Mon, 30 Apr 2012 13:12:15 +0000 (15:12 +0200)]
nv50/ir/ra: Fix argument const-ness in RegisterSet::idToUnits and idToBytes
Francisco Jerez [Wed, 6 Feb 2013 13:12:44 +0000 (14:12 +0100)]
nv50/ir/opt: Fix tryPropagateBranch for BBs with several exit branches.
Comments and "if (bf->cfg.incidentCount() == 1)" condition added
by Christoph Bumiller.
Francisco Jerez [Mon, 30 Apr 2012 13:06:52 +0000 (15:06 +0200)]
nv50/ir: Clean up references to function values before destroying them.
Francisco Jerez [Wed, 25 Apr 2012 21:48:47 +0000 (23:48 +0200)]
nouveau: Bail out from nouveau_fence_wait if flushing the pushbuf fails.
Vinson Lee [Mon, 11 Mar 2013 05:51:23 +0000 (22:51 -0700)]
mesa: Use correct functions for enum conversion.
Fixes mixing enum types defects reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Rob Clark [Sat, 27 Oct 2012 16:07:34 +0000 (11:07 -0500)]
freedreno: gallium driver for adreno
Currently works on a220. Others in the a2xx family look pretty similar
and should be pretty straightforward to support with the same driver.
The a3xx has a new shader ISA, and while many registers appear similar,
the register addresses have been completely shuffled around. I am not
sure yet whether it is best to support with the same driver, but
different compiler, or whether it should be split into a different
driver.
v1: original
v2: build file updates from review comments, and remove GPL licensed
header files from msm kernel
v3: smarter temp/pred register assignment, fix clear and depth/stencil
format issues, resource_transfer fixes, scissor fixes
Signed-off-by: Rob Clark <robdclark@gmail.com>
José Fonseca [Mon, 11 Mar 2013 10:13:47 +0000 (10:13 +0000)]
d3d1x: Remove.
Unused/unmaintained.
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
José Fonseca [Mon, 11 Mar 2013 10:14:19 +0000 (10:14 +0000)]
nv50: Remove nv50_ir_from_sm4.*
Unused, depends on d3d1x.
Reviewed-by: Christoph Bumiller <e0425955@student.tuwien.ac.at>
Roland Scheidegger [Sat, 9 Mar 2013 00:46:33 +0000 (01:46 +0100)]
gallivm: clean up passing derivatives around
Previously, the derivatives were calculated and passed in a packed form
to the sample code (for implicit derivatives, explicit derivatives were
packed to the same format).
There are several reasons why this wasn't such a good idea:
1) the derivatives may not even be needed (not as bad as it sounds since
llvm will just throw the calculations needed for them away but still)
2) the special packing format really shouldn't be part of the sampler
interface
3) depending on what the sample code actually does the derivatives will
be processed differently, hence there is no "ideal" packing. For cube
maps with explicit derivatives (which we don't do yet) for instance the
packing looked downright useless, and for non-isotropic filtering we'd
need different calculations too.
So, instead just pass the derivatives as is (for explicit derivatives),
or let the rho calculating sample code calculate them itself. This still
does exactly the same packing stuff for implicit derivatives for now,
though explicit ones are handled in a more straightforward manner (quick
estimates show performance should be quite similar, though it is much
easier to follow and also does the rho calculation per-pixel until the
end, which we eventually need for spec compliance anyway).
No piglit changes.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Chad Versace [Thu, 21 Feb 2013 03:59:07 +0000 (19:59 -0800)]
i965: Fix typo in doxygen hyperlink
s/brw_state_upload/brw_upload_state/
Found because the link was broken.
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Eric Anholt [Wed, 20 Feb 2013 01:01:41 +0000 (17:01 -0800)]
mesa: Reduce memory usage for reg alloc with many graph nodes (part 2).
After the previous fix that almost removes an allocation of 4*n^2
bytes, we can use a bitset to reduce another allocation from n^2 bytes
to n^2/8 bytes.
Between the previous commit and this one, the peak heap size for an
oglconform ARB_fragment_program max instructions test on i965 goes from
4GB to 255MB.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55825
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
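The saving from n^2 bytes to roughly n^2/8 bytes described in the commit above comes from storing one bit per node pair instead of one byte. The sketch below shows the idea with a plain byte array; it is not the code from Mesa's register allocator, and the adj_* names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical n x n adjacency matrix packed one bit per entry. */
struct adj_bits {
   unsigned n;
   uint8_t *bits;
};

/* Bytes needed for n*n bits (always at least one byte). */
static size_t
adj_bytes(unsigned n)
{
   return ((size_t)n * n) / 8 + 1;
}

static bool
adj_init(struct adj_bits *a, unsigned n)
{
   a->n = n;
   a->bits = calloc(adj_bytes(n), 1); /* all pairs start non-adjacent */
   return a->bits != NULL;
}

static void
adj_set_one(struct adj_bits *a, unsigned i, unsigned j)
{
   size_t bit = (size_t)i * a->n + j;
   a->bits[bit / 8] |= (uint8_t)(1u << (bit % 8));
}

/* Interference is symmetric, so record both (i, j) and (j, i). */
static void
adj_set(struct adj_bits *a, unsigned i, unsigned j)
{
   adj_set_one(a, i, j);
   adj_set_one(a, j, i);
}

static bool
adj_test(const struct adj_bits *a, unsigned i, unsigned j)
{
   size_t bit = (size_t)i * a->n + j;
   return (a->bits[bit / 8] >> (bit % 8)) & 1u;
}

static void
adj_free(struct adj_bits *a)
{
   free(a->bits);
   a->bits = NULL;
}
```

For 10 nodes this layout needs 13 bytes rather than 100; the same ratio applies at the graph sizes the commit message mentions.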