Michel Dänzer [Tue, 6 Aug 2013 08:45:50 +0000 (10:45 +0200)]
radeonsi: Number of SGPRs retrieved from LLVM already includes VCC
Fixes spurious 'Assertion `num_sgprs <= 104' failed.' with shaders using
all 104 SGPRs.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Christian König <christian.koenig@amd.com>
Kenneth Graunke [Fri, 2 Aug 2013 07:11:10 +0000 (00:11 -0700)]
i965: Don't allocate curbe buffers on Gen6+.
These are only used on Gen4-5. Why waste the 8kB of space?
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Vinson Lee [Sun, 4 Aug 2013 08:18:28 +0000 (01:18 -0700)]
llvmpipe: Do not need to free anything if there is no geometry shader.
If gs is null, then freeing state->shader.tokens would result in a null
dereference.
Fixes "Dereference after null check" defect reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Vinson Lee [Sun, 4 Aug 2013 07:13:53 +0000 (00:13 -0700)]
nvc0: Initialize ptr for unexpected sample_count on release builds.
Fixes "Uninitialized pointer read" defect reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Vinson Lee [Tue, 6 Aug 2013 00:33:51 +0000 (17:33 -0700)]
draw: Change slot from unsigned to int.
unfilled_stage::face_slot is of type int.
Fixes "Unsigned compared against 0" defect reported by Coverity.
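A minimal sketch of why the signedness matters, assuming a negative value is used as a "no slot" sentinel (the sentinel and both helper names are hypothetical, not taken from the draw module):

```c
#include <stdbool.h>
#include <limits.h>

/* Hypothetical check: with a signed slot, a negative value can mean
 * "no slot assigned", so this comparison is meaningful. */
bool example_has_slot(int slot)
{
   return slot >= 0;
}

/* Storing the same sentinel in an unsigned variable wraps it to a large
 * positive value, so a ">= 0" test on it could never fail. */
unsigned example_as_unsigned(int slot)
{
   return (unsigned)slot;
}
```

With an unsigned variable the comparison against 0 is always true, which is the pattern Coverity flags.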
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Vinson Lee [Sat, 3 Aug 2013 06:39:24 +0000 (23:39 -0700)]
postprocess: Check ppq is null before calling pp_free_bos.
pp_free_bos dereferences ppq without a null check.
Fixes "Dereference before null check" defect reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Zack Rusin [Sat, 3 Aug 2013 06:56:19 +0000 (02:56 -0400)]
draw: add back separate input assembler
the issue is that stream output is run before the pipeline, which
means that unless we decompose the primitives before the so
then things crash. we could convert the entire stream output
code into a pipeline stage but it will take a bit, so for now
fix the crashes by simply re-adding the old input assembler
which is run before the SO.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Zack Rusin [Fri, 2 Aug 2013 06:25:42 +0000 (02:25 -0400)]
draw: implement proper primitive assembler as a pipeline stage
we used to have a face primitive assembler that we ran after if
the gs was missing but we had adjacency primitives in the pipeline,
let's convert it to a pipeline stage, which allows us to use it
to inject outputs (primitive id) into the vertices. it's also
a lot cleaner because the decomposition is already handled for us.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Zack Rusin [Fri, 2 Aug 2013 05:50:05 +0000 (01:50 -0400)]
draw: fix front face injection
Inject front face only if the fragment shader uses it and
propagate through all channels because otherwise we'll
need to figure out the exact swizzle that the fs expects and
it's just simpler to make sure all the components within
the front face register are correctly set.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Brian Paul [Fri, 2 Aug 2013 14:00:54 +0000 (08:00 -0600)]
tgsi: remove unneeded File == TGSI_FILE_INPUT test
We're already in an "if (File == TGSI_FILE_INPUT)" block at that point.
Brian Paul [Mon, 5 Aug 2013 14:19:36 +0000 (08:19 -0600)]
tgsi: clean up tgsi_scan_shader() function
Replace "fulldecl->Semantic.Name/Index" with semName/semIndex.
Simplify if/else logic for TGSI_FILE_OUTPUT code.
Remove old comment.
Fix indentation.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Zack Rusin [Sat, 3 Aug 2013 02:08:25 +0000 (22:08 -0400)]
llvmpipe: fix frontface behavior again
Let's make sure the frontface is 1 for front and -1 for back.
Discussed with Roland and Jose.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Vinson Lee [Sun, 4 Aug 2013 06:58:43 +0000 (23:58 -0700)]
r600g/sb: Dump correct value for CND.
Fixes "Copy-paste error" reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Vadim Girlin <vadimgirlin@gmail.com>
Jordan Justen [Mon, 29 Jul 2013 20:48:26 +0000 (13:48 -0700)]
intel_fbo: remove unused intel_renderbuffer hiz functions
We are now using functions that operate on the renderbuffer
attachment to handle layered rendering.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Jordan Justen [Mon, 29 Jul 2013 20:58:03 +0000 (13:58 -0700)]
i965 clear/draw: set renderbuffer attachment as needing depth resolve
Previously we would mark a renderbuffer as needing a depth resolve.
But, to support layered rendering, we need to look at the attachment
instead, since the attachment knows if layered rendering is being
used.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Jordan Justen [Mon, 29 Jul 2013 20:51:31 +0000 (13:51 -0700)]
i965: add intel_renderbuffer_att_set_needs_depth_resolve
This function is needed to support layered rendering. With
layered rendering, the attachment stores the state of whether
layered rendering is being used.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Jordan Justen [Mon, 29 Jul 2013 20:54:47 +0000 (13:54 -0700)]
i965: add intel_miptree_set_all_slices_need_depth_resolve
This function marks all slices of a renderbuffer at a particular
level as needing a depth resolve.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Jordan Justen [Fri, 19 Apr 2013 08:13:31 +0000 (01:13 -0700)]
i965 gen7: don't set FORCE_ZERO_RTAINDEX for layered rendering
When layered rendering is being used, we should not set
FORCE_ZERO_RTAINDEX in the clip state to allow render target
array values other than zero to be used.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Jordan Justen [Mon, 15 Jul 2013 23:37:15 +0000 (16:37 -0700)]
hsw hiz: Remove x/y offset restriction for hiz
This restriction was related to programming the offset fields
of the depth buffer packet. We are now setting these offsets
to 0, so this restriction should no longer be required.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Jordan Justen [Tue, 9 Jul 2013 22:36:32 +0000 (15:36 -0700)]
gen7 depth surface: program 3DSTATE_DEPTH_BUFFER to top of surface
Previously we would always find the 2D sub-surface of interest,
and then program the surface to this location. Now we always
program the 3DSTATE_DEPTH_BUFFER at the start of the surface.
To select the lod/slice, we utilize the lod & minimum array
element fields.
As part of this change, we must revert
1f112ccf:
Revert "i965/gen7: Align all depth miplevels to 8 in the X direction."
We also must disable brw_workaround_depthstencil_alignment for
gen >= 7. Now the hardware will handle alignment when rendering
to additional slices/LODs.
v2:
* Merge with recent MOCS changes
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Jordan Justen [Fri, 19 Jul 2013 22:44:56 +0000 (15:44 -0700)]
gen7 fbo: make unmatched depth/stencil configs return unsupported
For gen >= 7, we will use the lod/minimum-array-element fields to
support layered rendering. This means that we must restrict
the depth & stencil attachments to match in various more restrictive
ways. (Now the width, height, depth, LOD and layer must match)
The reason width, height, and depth must match is that the hardware
has a single set of width, height, and depth settings (in
3DSTATE_DEPTH_BUFFER) that affect both the depth and stencil buffers.
Since these controls determine the miptree layout, they need to be
set correctly in order for lod and minimum-array-element to work
properly. So the only way rendering can work is if the width,
height, and depth match.
In the future, if this restriction proves to be a problem (say
because some crucial client application relies on rendering to
different levels/layers of stencil and depth buffers), then we can
always work around the restriction by copying depth and/or stencil
data to a temporary buffer prior to rendering (much in the same way
that brw_workaround_depthstencil_alignment() does today for
gen < 7), but hopefully that won't be necessary.
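The matching rule above can be sketched roughly as follows. The struct and function names are hypothetical and exist only for this illustration; they are not the driver's data structures.

```c
#include <stdbool.h>

/* Hypothetical description of one attachment; not a Mesa structure. */
struct example_attachment {
   unsigned width, height, depth;
   unsigned lod, layer;
};

/* Returns true when the depth and stencil attachments agree on every
 * property the commit message lists: width, height, depth, LOD and layer. */
bool example_depth_stencil_match(const struct example_attachment *d,
                                 const struct example_attachment *s)
{
   return d->width == s->width &&
          d->height == s->height &&
          d->depth == s->depth &&
          d->lod == s->lod &&
          d->layer == s->layer;
}
```

A framebuffer whose attachments fail a check of this kind would be reported as unsupported on gen >= 7.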
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Jordan Justen [Tue, 16 Jul 2013 07:01:05 +0000 (00:01 -0700)]
hsw hiz: Add new size restrictions for miplevels > 0
When performing hiz ops, we must ensure that the region sizes
have an 8 aligned width and 4 aligned height. We can tweak the
size for blorp hiz operations at LOD 0, but for the others we
can't. Therefore, we disable hiz for these miplevels if they
don't meet the size alignment requirements.
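A minimal sketch of such a check, assuming each miplevel's size is the base size shifted right by the level and clamped to 1 (the helper names are hypothetical, and the real driver logic may differ):

```c
#include <stdbool.h>

/* Size of a miplevel derived from the base size; assumes level < 32. */
static unsigned example_minify(unsigned size, unsigned level)
{
   unsigned s = size >> level;
   return s > 0 ? s : 1;
}

/* LOD 0 is always accepted because its size can be adjusted for the hiz
 * operation; other levels need an 8-aligned width and a 4-aligned height. */
bool example_level_can_use_hiz(unsigned base_width, unsigned base_height,
                               unsigned level)
{
   if (level == 0)
      return true;

   unsigned w = example_minify(base_width, level);
   unsigned h = example_minify(base_height, level);
   return (w % 8 == 0) && (h % 4 == 0);
}
```

For a 100x100 surface, level 1 is 50x50, which fails the width test, so hiz would be disabled for that level.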
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Jordan Justen [Tue, 9 Jul 2013 22:32:42 +0000 (15:32 -0700)]
gen7 blorp depth: calculate base surface width/height
This will be used in 3DSTATE_DEPTH_BUFFER in a later patch.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Jordan Justen [Tue, 9 Jul 2013 22:24:56 +0000 (15:24 -0700)]
gen7 depth surface: calculate minimum array element being rendered
In layered rendering this will be 0. Otherwise it will be the
selected slice.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Jordan Justen [Tue, 9 Jul 2013 22:19:55 +0000 (15:19 -0700)]
gen7 depth surface: calculate LOD being rendered to
This will be used in 3DSTATE_DEPTH_BUFFER in a later patch.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Jordan Justen [Tue, 9 Jul 2013 22:16:35 +0000 (15:16 -0700)]
gen7 depth surface: calculate depth (array size) for depth surface
This will be used in 3DSTATE_DEPTH_BUFFER in a later patch.
Note: Cube maps are treated as 2D arrays with 6 times as
many array elements as the cube map array would have.
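As a rough illustration of the note above (the function and parameter names are hypothetical): for a cube map or cube map array, `count` is the number of cubes, and for a plain 2D array it is the number of layers.

```c
#include <stdbool.h>

/* Array size of the depth surface when cube maps are laid out as
 * 2D arrays: each cube contributes six array elements. */
unsigned example_depth_array_size(bool is_cube, unsigned count)
{
   return is_cube ? 6 * count : count;
}
```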
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Jordan Justen [Tue, 9 Jul 2013 21:56:38 +0000 (14:56 -0700)]
gen7 depth surface: calculate more specific surface type
This will be used in 3DSTATE_DEPTH_BUFFER in a later patch.
Note: Cube maps are treated as 2D arrays with 6 times as
many array elements as the cube map array would have.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Jordan Justen [Tue, 9 Jul 2013 21:25:11 +0000 (14:25 -0700)]
i965: init global state first in brw_workaround_depthstencil_alignment
In a future pass this will allow us to exit early from this
routine to disable it for gen >= 7.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Ilia Mirkin [Thu, 1 Aug 2013 16:50:10 +0000 (12:50 -0400)]
nv50: fix some h264 interlaced decoding on vp2
Some videos specify mb_adaptive_frame_field_flag instead of
field_pic_flag. This implies that the pic height needs to be halved, and
this field needs to be passed to the VP engine.
Cc: "9.2" mesa-stable@lists.freedesktop.org
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Zack Rusin [Fri, 2 Aug 2013 05:53:15 +0000 (01:53 -0400)]
llvmpipe: don't interpolate front face or prim id
The loop was iterating over all the fs inputs and setting them
to perspective interpolation, then after the loop we were
creating extra output slots with the correct interpolation. Instead
of injecting bogus extra outputs, just set the interpolation
on front face and prim id correctly when doing the initial scan
of fs inputs.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Zack Rusin [Fri, 2 Aug 2013 05:45:45 +0000 (01:45 -0400)]
draw: make sure clipping works with injected outputs
clipping would drop the extra outputs because it always
used the number of standard vertex shader outputs, without
geometry shader or extra outputs. The commit makes sure
that clipping with geometry shaders which have more outputs
than the current vertex shader and with extra outputs correctly
propagates the entire vertex.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Zack Rusin [Wed, 31 Jul 2013 11:34:49 +0000 (07:34 -0400)]
draw: inject frontface info into wireframe outputs
Draw module can decompose primitives into wireframe models, which
is a fancy word for 'lines', unfortunately that decomposition means
that we weren't able to preserve the original front-face info which
could be derived from the original primitives (lines don't have a
'face'). To fix it allow draw module to inject a fake face semantic
into outputs from which the backends can figure out the original
frontfacing info of the primitives.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Zack Rusin [Fri, 2 Aug 2013 05:39:35 +0000 (01:39 -0400)]
draw: stop crashing with extra shader outputs
Draw sometimes injects extra shader outputs (aa points, lines or
front face), unfortunately most of the pipeline and llvm code
didn't handle them at all. It only worked if the number of inputs
happened to be greater than or equal to the number of shader outputs
plus the extra injected outputs. In particular when running
the pipeline which depends on the vertex_id in the vertex_header
things were completely broken. The patch adjusts the code to
correctly use the total number of shader outputs (the standard
ones plus the injected ones) to make it all stop crashing and
work.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Zack Rusin [Fri, 2 Aug 2013 05:48:36 +0000 (01:48 -0400)]
draw: use the vertex size
Instead of using the magical 4 use the above computed
vertex size. Doesn't change the behavior, just makes the code
a bit cleaner.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Zack Rusin [Fri, 2 Aug 2013 05:43:43 +0000 (01:43 -0400)]
draw/llvm: add some extra debugging output
when dumping shader outputs it's nice to have the integer
values of the outputs, in particular because some values
are integers.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Zack Rusin [Fri, 2 Aug 2013 05:24:41 +0000 (01:24 -0400)]
tgsi: detect prim id and front face usage in fs
Adding code to detect the usage of prim id and front face
semantics in fragment shaders.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Zack Rusin [Fri, 2 Aug 2013 23:06:46 +0000 (19:06 -0400)]
tgsi: add ucmp to the list of opcodes
we forgot to add ucmp to the list of opcodes, so it was never
generated for ureg.
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Zack Rusin [Fri, 2 Aug 2013 19:50:16 +0000 (15:50 -0400)]
llvmpipe: make the front-face behavior match the gallium spec
The spec says that front-face is true if the value is >0 and false
if it's <0. To make sure that we follow the spec, let's just
subtract 0.5 from our value (llvmpipe did 1 for frontface and 0
otherwise), which will get us a positive num for frontface and
negative for backface.
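The arithmetic described above, as a small sketch (function names are hypothetical; the input is the old 1-for-front, 0-otherwise value):

```c
#include <stdbool.h>

/* Shift the old 1/0 facing value so that front faces become positive
 * and back faces become negative. */
float example_facing_to_spec(float facing)
{
   return facing - 0.5f;
}

/* Consumer-side test following the rule quoted above: front-face is
 * true when the value is greater than zero. */
bool example_is_front_face(float value)
{
   return value > 0.0f;
}
```

A front face (1) maps to 0.5 and a back face (0) maps to -0.5.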
Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Matt Turner [Thu, 1 Aug 2013 21:29:05 +0000 (14:29 -0700)]
Makefile.am: Remove api_exec_es* from EXTRA_FILES.
These files were removed in commits
a0102154 and
a8ab7e33.
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Matt Turner [Mon, 11 Mar 2013 21:57:16 +0000 (14:57 -0700)]
mesa: Use MIN3 instead of two MIN2s.
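For illustration, a minimal sketch of the substitution. The macro definitions here are written for this example and may not match Mesa's own definitions.

```c
/* Illustrative definitions; Mesa's real macros may be written differently. */
#define MIN2(a, b) ((a) < (b) ? (a) : (b))
#define MIN3(a, b, c) MIN2(MIN2((a), (b)), (c))

/* Before: two nested MIN2 calls. */
int example_smallest_nested(int a, int b, int c)
{
   return MIN2(MIN2(a, b), c);
}

/* After: a single MIN3 call that yields the same result. */
int example_smallest_min3(int a, int b, int c)
{
   return MIN3(a, b, c);
}
```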
Matt Turner [Thu, 1 Aug 2013 21:20:23 +0000 (14:20 -0700)]
mesa: Update comments to match newer specs.
Old GL 1.x specs used 'b' but newer specs use 'p'. The line immediately
above the second hunk also uses 'p'.
Kenneth Graunke [Fri, 2 Aug 2013 07:01:41 +0000 (00:01 -0700)]
i965: Initialize the maximum number of GS threads on Haswell.
We'll need proper values for max_gs_threads when we eventually support
geometry shaders. Also, we initialize it for every other platform.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Kenneth Graunke [Fri, 2 Aug 2013 08:28:58 +0000 (01:28 -0700)]
glsl: Disallow interpolation qualifiers on non-input/output variables.
Commit
2548092ad8015 switched the sense of interpolation qualifier
checks in order to permit them on geometry shader in/out variables.
In doing so, it accidentally allowed interpolation qualifiers to be
applied to ordinary variables and function parameters.
Fixes a regression in Piglit's local-smooth-01.frag.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Kenneth Graunke [Fri, 2 Aug 2013 07:35:05 +0000 (00:35 -0700)]
glsl: Fix NULL pointer dereferences when linking fails.
Commit
7cfefe6965d50 introduced a check for whether linked->Type equals
GL_GEOMETRY_SHADER. However, linked may be NULL due to an earlier error
condition.
Since the entire function after the error path is (or should be) guarded
by linked != NULL checks, we may as well just return early and remove
the checks.
Fixes crashes in 9 Piglit tests.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Andreas Boll [Fri, 2 Aug 2013 09:22:23 +0000 (11:22 +0200)]
docs: Document UVD (2.2 and 3.0) video decoding support in mesa 9.2
Cc: "9.2" mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Andreas Boll [Fri, 2 Aug 2013 09:22:09 +0000 (11:22 +0200)]
docs: Document that i965 Gen6+ requires Kernel 3.6 or later
Cc: "9.2" mesa-stable@lists.freedesktop.org
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Timothy Arceri [Fri, 2 Aug 2013 11:57:50 +0000 (21:57 +1000)]
docs: Update some out of date sourcetree information
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
Christoph Bumiller [Thu, 1 Aug 2013 18:56:21 +0000 (20:56 +0200)]
r600g: honour semantic index in fragment color exports
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Andreas Boll [Fri, 2 Aug 2013 07:58:34 +0000 (09:58 +0200)]
docs: Add md5sums to 9.1.5 release notes
Andreas Boll [Fri, 2 Aug 2013 07:42:03 +0000 (09:42 +0200)]
docs: Fix a typo in the 9.1.6 release notes
Topi Pohjolainen [Mon, 12 Nov 2012 11:38:08 +0000 (13:38 +0200)]
i965: enable image external sampling for imported dma-buffers
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Topi Pohjolainen [Fri, 22 Mar 2013 13:58:05 +0000 (15:58 +0200)]
egl/dri2: support for creating images out of dma buffers
v2:
- upon success close the given file descriptors
v3:
- use specific entry for dma buffers instead of the basic for
primes, and enable the extension based on the availability
of the hook
v4 (Chad):
- use ARRAY_SIZE
- improve the comment about the number of file descriptors
- in case of invalid format report EGL_BAD_ATTRIBUTE instead
of EGL_BAD_MATCH
- take into account specific error set by the driver.
v5:
- fix error handling
v6 (Chad):
- fix invalid plane count checking
v7 (Chad):
- fix indentation and reset loop counter before checking
for excess attributes
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Topi Pohjolainen [Tue, 18 Jun 2013 10:47:43 +0000 (13:47 +0300)]
intel: restrict dma-buf-import images to external sampling only
Memory originating outside mesa stack is meant to be for reading
only. In addition, the restrictions imposed by the image external
extension should apply. For example, users shouldn't be allowed
to generate mip-trees based on these images.
v2 (Chad): document using full extension names, fix the comment
style itself and emit description of error
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Topi Pohjolainen [Fri, 22 Mar 2013 12:31:01 +0000 (14:31 +0200)]
egl: definitions for EXT_image_dma_buf_import
As specified in:
http://www.khronos.org/registry/egl/extensions/EXT/EGL_EXT_image_dma_buf_import.txt
Checking for valid fourcc values is left to the drivers, which avoids
a dependency on drm header files here.
v2: enforce EGL_NO_CONTEXT
v3: declare the extension as EGL (not GLES)
v4: do not update eglext.h manually but rely on update from
Khronos instead
v5: (Eric) report invalid context as EGL_BAD_PARAMETER instead of as
EGL_BAD_CONTEXT
v6: (Chad) fix the checking for valid hints. Before all values were
rejected.
v7: (Chad) comment style change from
/**
* Multi-
* line
into
/* Multi-
* line
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Topi Pohjolainen [Tue, 26 Mar 2013 13:14:20 +0000 (15:14 +0200)]
dri: propagate extra dma_buf import attributes to the drivers
v2: do not break ABI, but instead introduce new entry point for
dma buffers and bump up the dri-interface version to eight
v3 (Chad): allow the hook to specify an error originating from the
driver. For now only unsupported format is considered.
I thought about rejecting the hints also as they are
addressing only YUV sampling which is not supported at
the moment but then thought against it as the spec is
not saying one way or the other.
v4 (Eric, Chad): restrict to rgb formatted only
v5: rebased on top of i915/i965 split
v6 (Chad): document using full extension name
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Topi Pohjolainen [Wed, 17 Apr 2013 10:11:16 +0000 (13:11 +0300)]
intel: set dri image dimensions even when creating out of primes
Otherwise 'intel_set_texture_image_region()' won't have enough
details to work with.
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Topi Pohjolainen [Fri, 28 Dec 2012 10:22:54 +0000 (12:22 +0200)]
intel: refactor planar format lookup
v2 (Eric): refactor both occurrences, not just one
v3 (Chad): replace 0 by NULL
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Topi Pohjolainen [Wed, 27 Mar 2013 13:32:21 +0000 (15:32 +0200)]
intel: do not create renderbuffers out of planar images
v2 (Chad): emit 'GL_INVALID_OPERATION' and description of error
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Topi Pohjolainen [Thu, 25 Apr 2013 11:33:09 +0000 (14:33 +0300)]
intel: allow packed prime buffers to be treated normally
v2:
- fix earlier rebase error breaking bisect
(loaderPriv -> loaderPrivate)
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Mon, 29 Jul 2013 04:48:55 +0000 (21:48 -0700)]
main: Warn that geometry shader support is experimental.
Geometry shader support in the Mesa front end is still fairly
preliminary. Many features are untested, and the following things are
known not to work:
- The gl_in interface block
- The gl_ClipDistance input
- Transform feedback of geometry shader outputs
- Constants that are new in GLSL 1.50 (e.g. gl_MaxGeometryInputComponents)
This isn't a problem, since no back-end drivers currently enable
geometry shaders. However, to make sure no one gets the wrong
impression, emit a nasty warning to let the user know that geometry
shader support isn't complete.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Wed, 31 Jul 2013 04:13:48 +0000 (21:13 -0700)]
glsl: Implement rules for geometry shader input sizes.
Section 4.3.8.1 (Input Layout Qualifiers) of the GLSL 1.50 spec
contains some tricky rules for how the sizes of geometry shader input
arrays are related to the input layout specification. In essence,
those rules boil down to the following:
- If an input array declaration does not specify a size, and it
follows an input layout declaration, it is sized according to the
input layout.
- If an input layout declaration follows an input array declaration
that didn't specify a size, the input array declaration is given a
size at the time the input layout declaration appears.
- All input layout declarations and input array sizes must ultimately
match. Inconsistencies are reported as soon as they are detected,
at compile time if the inconsistency is within one compilation unit,
otherwise at link time.
- At least one compilation unit must contain an input layout
declaration.
(Note: the geom_array_resize_visitor class was contributed by Bryan
Cain <bryancain3@gmail.com>.)
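The first three rules can be sketched as a small size-resolution helper. This is an illustration only: the names are hypothetical, a size of 0 stands for "unsized" or "no layout seen yet", and the compiler's actual implementation is structured differently.

```c
#include <stdbool.h>

enum example_gs_input {
   EXAMPLE_POINTS,
   EXAMPLE_LINES,
   EXAMPLE_LINES_ADJACENCY,
   EXAMPLE_TRIANGLES,
   EXAMPLE_TRIANGLES_ADJACENCY
};

/* Number of vertices per input primitive, which is the array size an
 * input layout declaration implies. */
unsigned example_vertices_per_prim(enum example_gs_input prim)
{
   switch (prim) {
   case EXAMPLE_POINTS:              return 1;
   case EXAMPLE_LINES:               return 2;
   case EXAMPLE_LINES_ADJACENCY:     return 4;
   case EXAMPLE_TRIANGLES:           return 3;
   case EXAMPLE_TRIANGLES_ADJACENCY: return 6;
   }
   return 0;
}

/* Reconcile one input array with the current layout size.
 * Returns false when an explicit or previously assigned size conflicts
 * with the layout; an unsized array takes its size from the layout. */
bool example_resolve_input_size(unsigned *array_size, unsigned layout_size)
{
   if (layout_size == 0)
      return true;               /* no layout yet; size stays pending */
   if (*array_size == 0) {
      *array_size = layout_size; /* unsized array is sized by the layout */
      return true;
   }
   return *array_size == layout_size;
}
```

Calling the helper again when a later layout declaration appears covers the second rule, since the array keeps the size it was given and any mismatch is then reported.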
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Wed, 24 Jul 2013 21:57:24 +0000 (14:57 -0700)]
glsl: Allow geometry shader input instance arrays to be unsized.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Mon, 22 Jul 2013 18:44:24 +0000 (11:44 -0700)]
glsl: Permit non-ubo input interface arrays to use non-const indexing.
From the GLSL ES 3.00 spec:
"All indexes used to index a uniform block array must be constant
integral expressions."
Similar text exists in GLSL specs since 1.50.
When we implemented this, the only type of interface block supported
by Mesa was uniform blocks, so we required all indexes used to index
any interface block to be constant integral expressions.
Now that we are adding interface block support for GLSL 1.50, we need
a more specific check.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Thu, 13 Jun 2013 01:12:40 +0000 (18:12 -0700)]
glsl: Cross-validate GS layout qualifiers while intrastage linking.
This gets piglit's geometry-basic test running.
TODO: Still need to validate that the GS layout qualifiers don't get used
in places they shouldn't (like an interface block, or a particular shader
input or output)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Thu, 13 Jun 2013 00:21:44 +0000 (17:21 -0700)]
glsl: Export the compiler's GS layout qualifiers to the gl_shader.
Next step is to validate them at link time.
v2 (Paul Berry <stereotype441@gmail.com>): Don't attempt to export the
layout qualifiers in the event of a compile error, since some of them
are set up by ast_to_hir(), and ast_to_hir() isn't guaranteed to have
run in the event of a compile error.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
v3 (Paul Berry <stereotype441@gmail.com>): Use PRIM_UNKNOWN to
represent "not set in this shader".
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Eric Anholt [Wed, 12 Jun 2013 21:03:49 +0000 (14:03 -0700)]
glsl: Parse the GLSL 1.50 GS layout qualifiers.
Limited semantic checking (compatibility between declarations, checking
that they're in the right shader target, etc.) is done.
v2: Remove stray debug printfs.
v3 (Paul Berry <stereotype441@gmail.com>): Process input layout
qualifiers at ast_to_hir time rather than at parse time, since certain
error conditions depend on the relative ordering between input layout
qualifiers, declarations, and calls to .length().
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Anholt [Wed, 12 Jun 2013 20:46:57 +0000 (13:46 -0700)]
glsl: Make sure that we don't put too many bitfields in ast_type_qualifier.
We do some tests of qualifiers using a union containing an int and the
struct full of bitfields, so make sure the bitfields don't spill
outside the int.
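A minimal C11 sketch of this kind of guard, using an invented flag struct (the field names are placeholders, and the actual check added to the compiler may be written differently):

```c
#include <stdbool.h>

/* Hypothetical stand-in for the qualifier flags. */
struct example_flags {
   unsigned invariant:1;
   unsigned constant:1;
   unsigned smooth:1;
   unsigned flat:1;
};

/* The integer member lets "is any flag set?" be a single comparison,
 * which only works if every bitfield lives inside that integer. */
union example_qualifier {
   struct example_flags q;
   unsigned i;
};

/* Compile-time guard: fails the build if the bitfields outgrow the int. */
_Static_assert(sizeof(struct example_flags) <= sizeof(unsigned),
               "bitfields spill outside the int");

bool example_has_any_flag(const union example_qualifier *qual)
{
   return qual->i != 0;
}
```

If a new bitfield pushed the struct past the size of the integer, the whole-union test would silently miss it, which is what the guard is meant to catch.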
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Mon, 17 Jun 2013 03:18:59 +0000 (06:18 +0300)]
main: Fix delete_shader_cb() for geometry shaders
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fabian Bieler [Fri, 24 May 2013 21:26:54 +0000 (23:26 +0200)]
glsl/linker: Fail to link geometry shader without vertex shader.
From section 2.15 (Geometry Shaders) of the OpenGL 3.2 spec:
A program object that includes a geometry shader must also include
a vertex shader; otherwise a link error will occur.
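The rule quoted above amounts to a check like the following sketch (the struct and function are hypothetical and do not reflect the linker's real data structures):

```c
#include <stdbool.h>

/* Hypothetical summary of which stages a program object contains. */
struct example_program {
   bool has_vertex_shader;
   bool has_geometry_shader;
};

/* Linking fails when a geometry shader is present without a vertex shader. */
bool example_stages_link_ok(const struct example_program *prog)
{
   return !(prog->has_geometry_shader && !prog->has_vertex_shader);
}
```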
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fabian Bieler [Sat, 25 May 2013 10:39:46 +0000 (12:39 +0200)]
mesa: Validate the drawing primitive against the geometry shader input primitive type.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fabian Bieler [Wed, 29 May 2013 22:17:42 +0000 (00:17 +0200)]
mesa/shaderapi: Allow 0 GEOMETRY_VERTICES_OUT.
ARB_geometry_shader4 spec Errors:
"The error INVALID_VALUE is generated by ProgramParameteriARB if <pname>
is GEOMETRY_VERTICES_OUT_ARB and <value> is negative."
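A sketch of the resulting validation, covering only the negative-value rule quoted above. The function and constant names are hypothetical, and any other checks on the value are deliberately left out.

```c
/* Local stand-ins for the GL error codes used in this example. */
#define EXAMPLE_NO_ERROR      0
#define EXAMPLE_INVALID_VALUE 0x0501

/* Accepts zero and positive values; only negative values are an error. */
int example_set_vertices_out(int value, int *vertices_out)
{
   if (value < 0)
      return EXAMPLE_INVALID_VALUE;

   *vertices_out = value;
   return EXAMPLE_NO_ERROR;
}
```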
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Wed, 10 Apr 2013 23:32:40 +0000 (16:32 -0700)]
glsl: Properly pack GS output varyings
In geometry shaders, outputs are consumed at the time of a call to
EmitVertex() (as opposed to all other shader types, where outputs are
consumed when the shader exits). Therefore, when packing geometry
shader output varyings using lower_packed_varyings, we need to do the
packing at the time of the EmitVertex() call.
This patch accomplishes that by adding a new visitor class,
lower_packed_varyings_gs_splicer, which is responsible for splicing
the varying packing code into place wherever EmitVertex() is found.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Wed, 10 Apr 2013 21:25:13 +0000 (14:25 -0700)]
glsl: Modify varying packing to use a temporary exec_list.
This patch modifies lower_packed_varyings to store the packing code it
generates in a temporary exec_list, and then splice that list into the
shader's main() function when it's done. This paves the way for
supporting geometry shader outputs, where we'll have to splice a clone
of the packing code before every call to EmitVertex().
As a side benefit, varying packing code is now emitted in the same
order for inputs and outputs; this should make debug output a little
easier to read.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Wed, 10 Apr 2013 13:48:42 +0000 (06:48 -0700)]
glsl/linker: Properly pack GS input varyings.
Since geometry shader inputs are arrays (where the array index
indicates which vertex is being examined), varying packing needs to
treat them differently.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Wed, 10 Apr 2013 02:51:41 +0000 (19:51 -0700)]
glsl/linker: Properly error check VS-GS linkage.
From section 4.3.4 (Inputs) of the GLSL 1.50 spec:
Geometry shader input variables get the per-vertex values written
out by vertex shader output variables of the same names. Since a
geometry shader operates on a set of vertices, each input varying
variable (or input block, see interface blocks below) needs to be
declared as an array.
Therefore, the element type of each geometry shader input array should
match the type of the corresponding vertex shader output.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
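The rule above can be sketched roughly as follows. This is an illustration
under stated assumptions, not the linker's code: sketch_type is a
hypothetical, much-simplified stand-in for Mesa's type representation, and
array lengths are deliberately ignored.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified type description (an assumption for this
 * sketch; it is not Mesa's glsl_type). */
struct sketch_type {
   int base;                          /* arbitrary id for non-array types */
   const struct sketch_type *element; /* non-NULL only for array types */
};

/* Structural equality that ignores array lengths. */
bool
sketch_types_equal(const struct sketch_type *a, const struct sketch_type *b)
{
   if ((a->element == NULL) != (b->element == NULL))
      return false;
   if (a->element != NULL)
      return sketch_types_equal(a->element, b->element);
   return a->base == b->base;
}

/* A geometry shader input must be an array, and its element type must
 * match the type of the corresponding vertex shader output. */
bool
gs_input_matches_vs_output(const struct sketch_type *gs_input,
                           const struct sketch_type *vs_output)
{
   if (gs_input->element == NULL)
      return false;
   return sketch_types_equal(gs_input->element, vs_output);
}
```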
Paul Berry [Wed, 10 Apr 2013 14:04:33 +0000 (07:04 -0700)]
glsl: Require geometry shader inputs to be arrays.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Sat, 23 Mar 2013 17:51:53 +0000 (10:51 -0700)]
mesa: Copy linked program data for GS.
The documentation for gl_shader_program.Geom and gl_geometry_program
says that the former is copied to the latter at link time, but this
wasn't happening. This patch causes _mesa_ir_link_shader() to perform
the copy, and updates the comment accordingly.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Sat, 23 Mar 2013 17:51:53 +0000 (10:51 -0700)]
mesa: Refactor copying of linked program data.
This patch creates a single function to copy the UsesClipDistance
flag from gl_shader_program.Vert to gl_vertex_program. Previously
this logic was duplicated in the i965-specific function
brw_link_shader() and the core mesa function _mesa_ir_link_shader().
This logic will have to be expanded to support geometry shaders, and I
don't want to have to update it in two separate places.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bryan Cain [Fri, 15 Feb 2013 15:46:50 +0000 (09:46 -0600)]
glsl: support compilation of geometry shaders
This commit adds all of the parsing and semantics for GLSL 150 style
geometry shaders.
v2 (Paul Berry <stereotype441@gmail.com>): Add a few missing calls to
get_pipeline_stage(). Fix some signed/unsigned comparison warnings.
Fix handling of NULL consumer in assign_varying_locations().
v3 (Bryan Cain <bryancain3@gmail.com>): fix indexing order of 2D
arrays. Also, allow interpolation qualifiers in geometry shaders.
v4 (Paul Berry <stereotype441@gmail.com>): Eliminate
get_pipeline_stage()--it is no longer needed thanks to 030ca23 (mesa:
renumber shader indices according to their placement in pipeline).
Remove 2D stuff. Move vertices_per_prim() to ir.h, so that it will be
accessible from outside the linker. Remove
inject_num_vertices_visitor. Rework for GLSL 1.50.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
v5 (Paul Berry <stereotype441@gmail.com>): Split out
do_set_program_inouts() argument refactoring to a separate patch.
Move geom_array_resizing_visitor to later in the series.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Wed, 31 Jul 2013 05:38:43 +0000 (22:38 -0700)]
glsl/linker: Make separate allocations to track vertex and fragment shaders.
There's no reason to be clever about this. By making separate
allocations for vertex and fragment shaders, we'll allow geometry
shaders to be added without introducing any complication.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bryan Cain [Fri, 15 Feb 2013 14:53:20 +0000 (08:53 -0600)]
glsl: add builtins for geometry shaders.
v2 (Paul Berry <stereotype441@gmail.com>): Account for rework of
builtin_variables.cpp. Use INTERP_QUALIFIER_FLAT for gl_PrimitiveID
so that it will obey provoking vertex conventions. Convert to GLSL
1.50 style geometry shaders.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
v3 (Paul Berry <stereotype441@gmail.com>): Be less obscure about
setting interpolation field of gl_Primitive variables.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bryan Cain [Fri, 15 Feb 2013 15:26:35 +0000 (09:26 -0600)]
glsl: add ir_emit_vertex and ir_end_primitive instruction types
These correspond to the EmitVertex and EndPrimitive functions in GLSL.
v2 (Paul Berry <stereotype441@gmail.com>): Add stub implementations of
new pure visitor functions to i965's vec4_visitor and fs_visitor
classes.
v3 (Paul Berry <stereotype441@gmail.com>): Rename classes to be more
consistent with the names used in the GL spec.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bryan Cain [Fri, 15 Feb 2013 08:40:12 +0000 (02:40 -0600)]
mesa: account for geometry shader texture fetches in update_texture_state
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Sun, 28 Jul 2013 16:23:11 +0000 (09:23 -0700)]
main: Allow for the possibility of GL 3.2 without ARB_geometry_shader4.
Previously, we assumed that the only way Mesa would expose geometry
shader support was via the ARB_geometry_shader4 extension. But this
extension has some extra complications over GL 3.2 (interactions with
compatibility-only features, and link-time initialization of the
constant gl_VerticesIn). So we want to allow for the possibility of
supporting GL 3.2 (with GLSL 1.50 style geometry shaders) even if
ctx->Extensions.ARB_geometry_shader4 is false.
This patch adds a new function, _mesa_has_geometry_shaders(), which
returns true if either ARB_geometry_shader4 is supported or the GL
version is at least 3.2 desktop. Since compute_version() only enables
GL 3.2 functionality when GLSL 1.50 support is present, a sufficient
way for a back-end to advertise geometry shader support is to set
ctx->Const.GLSLVersion >= 150.
v2: Remove unnecessary ctx->Const.GeometryShaders150 constant.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
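The predicate described above could look roughly like this. The struct and
function names are hypothetical stand-ins, not Mesa's gl_context or the
real _mesa_has_geometry_shaders(); only the logic stated in the message is
modeled.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant parts of the GL context. */
struct sketch_context {
   bool is_desktop_gl;        /* desktop GL, as opposed to GLES */
   unsigned version;          /* e.g. 32 for GL 3.2 */
   bool arb_geometry_shader4; /* extension flag */
};

/* True if either the extension is supported or the context is desktop GL
 * with a version of at least 3.2. */
bool
sketch_has_geometry_shaders(const struct sketch_context *ctx)
{
   return ctx->arb_geometry_shader4 ||
          (ctx->is_desktop_gl && ctx->version >= 32);
}
```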
Paul Berry [Mon, 24 Jun 2013 15:50:04 +0000 (08:50 -0700)]
main: Fix geometry shader error messages (missing right paren)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Fri, 28 Jun 2013 20:02:23 +0000 (13:02 -0700)]
glsl: Add EXT_texture_array support for geometry shaders.
We can't just use a ".glsl" file since the Lod variants are only
available in vertex and geometry shaders, while the bias variants are
only available in the fragment shader.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Mon, 10 Jun 2013 21:01:45 +0000 (14:01 -0700)]
glsl/linker: Make update_array_sizes apply to just uniforms.
Commit 586b4b5 (glsl: Also update implicit sizes of varyings at link
time) extended update_array_sizes() to apply to both uniforms and
shader ins/outs. However, doing so creates problems for geometry
shaders, because update_array_sizes() assumes that variables with
matching names in different parts of the pipeline should have the same
sizes. With the addition of geometry shaders, this is no longer true
(e.g. both vertex and geometry shaders have a gl_ClipDistance output
variable, but there's no reason these variables should have the same
sizes).
The original reason for commit 586b4b5 (avoid problems with
gl_TexCoord being 0 length) has since been addressed by commit 6f53921
(linker: Ensure that unsized arrays have a size after linking). So go
ahead and switch update_array_sizes() back to only acting on uniforms.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Wed, 31 Jul 2013 19:49:49 +0000 (12:49 -0700)]
glsl: Modify ir_set_program_inouts to handle geometry shaders.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Paul Berry [Wed, 31 Jul 2013 18:25:13 +0000 (11:25 -0700)]
glsl: In ir_set_program_inouts, handle indexing outside array/matrix bounds.
According to GLSL, indexing into an array or matrix with an
out-of-range constant results in a compile error. However, indexing
with an out-of-range value that isn't constant merely results in
undefined results.
Since optimization passes (e.g. loop unrolling) can convert
non-constant array indices into constant array indices, it's possible
that ir_set_program_inouts will encounter a constant array index that
is out of range; if this happens, just mark the whole array as used.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
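A rough sketch of the fallback described above, under simplifying
assumptions: usage is tracked in a 64-bit mask, each array element takes
exactly one slot, and base_slot + length does not exceed 64. The function
name is hypothetical and is not the one used in ir_set_program_inouts.

```c
#include <stdint.h>

/* Hypothetical helper: mark the slot for one constant-indexed element of
 * an array that occupies `length` consecutive slots starting at
 * `base_slot`.  If the constant index is out of range, mark every slot of
 * the array instead. */
uint64_t
mark_array_element(uint64_t used, unsigned base_slot, unsigned length,
                   int const_index)
{
   if (const_index < 0 || (unsigned) const_index >= length) {
      /* Out-of-range constant index: mark the whole array as used. */
      for (unsigned i = 0; i < length; i++)
         used |= UINT64_C(1) << (base_slot + i);
      return used;
   }

   /* In-range index: mark just the one element. */
   return used | (UINT64_C(1) << (base_slot + (unsigned) const_index));
}
```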
Paul Berry [Wed, 31 Jul 2013 17:57:22 +0000 (10:57 -0700)]
glsl: Fall back gracefully if ir_set_program_inouts sees unexpected indexing.
The code in ir_set_program_inouts that marks just a portion of a
variable as used (rather than the whole variable) only works on a few
kinds of indexing operations:
- Indexing into matrices
- Indexing into arrays of matrices, vectors, or scalars.
Fortunately these are the only kinds of indexing operations that we
expect to see; everything else is either handled by a
previously-executed lowering pass or prohibited by GLSL.
However, that could conceivably change in the future (the GLSL rules
might change, or we might modify the lowering passes). To avoid
mysterious bugs in the future, let's have ir_set_program_inouts report
an assertion failure if it ever encounters an unexpected kind of
indexing operation (and in release builds, fall back to just marking
the whole variable as used).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Wed, 31 Jul 2013 17:15:49 +0000 (10:15 -0700)]
glsl: Extract marking functions from ir_set_program_inouts.
This patch extracts the functions mark_whole_variable() and
try_mark_partial_variable() from the ir_set_program_inouts visitor
functions. This will make the code easier to follow when we add
geometry shader support.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Wed, 31 Jul 2013 16:43:16 +0000 (09:43 -0700)]
glsl: Use count_attribute_slots() in ir_set_program_inouts.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Wed, 31 Jul 2013 15:24:48 +0000 (08:24 -0700)]
glsl: Expand count_attribute_slots() to cover structs.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Paul Berry [Wed, 31 Jul 2013 15:15:08 +0000 (08:15 -0700)]
Move count_attribute_slots() out of the linker and into glsl_type.
Our previous justification for leaving this function out of glsl_type
was that it implemented counting rules that were specific to GLSL
1.50. However, these counting rules also describe the number of
varying slots that Mesa will assign to a varying in the absence of
varying packing. That's useful to be able to compute from outside of
the linker code (a future patch will use it from
ir_set_program_inouts.cpp). So go ahead and move it to glsl_type.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
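For illustration, slot counting without varying packing can be sketched as
below. The type representation is hypothetical (it is not glsl_type), and
the per-kind rules are an assumption about the counting scheme: one slot
per scalar or vector, one per matrix column, element count times length
for arrays, and the sum of the fields for structs.

```c
#include <stddef.h>

enum slot_kind { SLOT_VECTOR, SLOT_MATRIX, SLOT_ARRAY, SLOT_STRUCT };

/* Hypothetical, simplified type description used only by this sketch. */
struct slot_type {
   enum slot_kind kind;
   unsigned columns;                      /* matrices only */
   unsigned length;                       /* array length or field count */
   const struct slot_type *element;       /* arrays only */
   const struct slot_type *const *fields; /* structs only */
};

unsigned
sketch_count_attribute_slots(const struct slot_type *t)
{
   unsigned total = 0;

   switch (t->kind) {
   case SLOT_VECTOR: /* scalars and vectors take a single slot */
      return 1;
   case SLOT_MATRIX: /* one slot per column */
      return t->columns;
   case SLOT_ARRAY:
      return t->length * sketch_count_attribute_slots(t->element);
   case SLOT_STRUCT: /* sum over all fields */
      for (unsigned i = 0; i < t->length; i++)
         total += sketch_count_attribute_slots(t->fields[i]);
      return total;
   }
   return 0;
}
```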
Paul Berry [Wed, 31 Jul 2013 03:49:56 +0000 (20:49 -0700)]
glsl: Change do_set_program_inouts' is_fragment_shader arg to shader_type.
This will allow us to add geometry shader support without having to
add another boolean argument.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Roland Scheidegger [Tue, 30 Jul 2013 19:26:27 +0000 (21:26 +0200)]
gallivm: obey clarified shift behavior
LLVM shifts are undefined for shift counts exceeding (or matching) bit width,
so we need to apply a mask for the tgsi shift instructions.
v2: only use mask for the tgsi shift instructions, not for the build shift
helpers. None of the internal callers need this behavior, and while llvm can
optimize away the masking for constants there are legitimate cases where it
might not be able to do so even if we know that shift count must be smaller
than type width (currently all such callers do not use the build shift
helpers).
Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Tue, 30 Jul 2013 15:16:17 +0000 (17:16 +0200)]
tgsi: obey clarified shift behavior
C shifts are undefined for shift counts exceeding (or matching) bit width,
so we need to apply a mask. (On x86 this would usually work anyway, since
integer-domain shifts mask the count, unless an auto-vectorizer came along:
SIMD-domain shifts do not mask the shift count.)
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Tue, 30 Jul 2013 15:08:01 +0000 (17:08 +0200)]
gallium: clarify shift behavior with shift count >= 32
Previously, nothing was said about what happens with shift counts exceeding
the bit width of the values to shift. In theory 3 behaviors are possible:
1) undefined (classic c definition)
2) just shift out all bits (so result is zero, or -1 potentially for ashr)
3) mask the shift count to bit width - 1
APIs either require 3) or are ok with 1). In particular, GLSL (as well as a
couple uninteresting legacy GL extensions) is happy with undefined, whereas
both OpenCL and d3d10 require 3). Consequently, most hw also implements 3).
So, for simplicity we just specify that 3) is required rather than saying
undefined and then needing state trackers to work around it.
Also while here specify shift count as a vector, not scalar. As far as I
can tell this was a doc bug, neither state trackers nor drivers used scalar
shift count.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
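Behavior 3) for 32-bit values can be sketched in plain C as follows. The
function names are hypothetical and this is not the tgsi or gallivm code;
it only shows that masking the count to bit width - 1 keeps the C shift
well defined for any count.

```c
#include <stdint.h>

/* Left shift with the count masked to bit width - 1 (31 for 32 bits). */
uint32_t
shl_masked(uint32_t value, uint32_t count)
{
   return value << (count & 31u);
}

/* Logical (unsigned) right shift with the same masking. */
uint32_t
ushr_masked(uint32_t value, uint32_t count)
{
   return value >> (count & 31u);
}
```

With this definition a shift count of 32 behaves like a count of 0, and a
count of 33 behaves like a count of 1.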
Carl Worth [Thu, 1 Aug 2013 22:45:04 +0000 (15:45 -0700)]
docs: Add md5sums to 9.1.6 release notes
Carl Worth [Thu, 1 Aug 2013 22:12:25 +0000 (15:12 -0700)]
docs: Import 9.1.6 release notes, add news item.