mesa.git
11 years agoi965/fs: Change brw_wm_prog_data::urb_read_length to num_varying_inputs.
Paul Berry [Tue, 3 Sep 2013 00:35:32 +0000 (17:35 -0700)]
i965/fs: Change brw_wm_prog_data::urb_read_length to num_varying_inputs.

On gen4-5, the FS stage reads varying inputs from URB entries that
were output by the SF thread, where each register stores the
interpolation setup for two components of a vec4, therefore the FS
urb_read_length is twice the number of FS input varyings.  On gen6+,
varying inputs are directly deposited in the FS payload by the SF/SBE
fixed function logic, so urb_read_length is irrelevant.

However, in future patches, it will be nice to be able to consult
brw_wm_prog_data to determine how many varying inputs the FS expects
(rather than inferring it from gl_program::InputsRead).  So instead of
storing urb_read_length, we simply store num_varying_inputs in
brw_wm_prog_data.  On gen4-5, we multiply this by 2 to recover the URB
read length.
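
As a minimal sketch of the relationship described above (the struct and
function names are hypothetical stand-ins, not the driver's actual code, and
returning 0 on gen6+ is an assumption used only to mark the value as unused):

```c
#include <assert.h>

/* Hypothetical stand-in for the relevant field of brw_wm_prog_data. */
struct wm_prog_data_sketch {
   unsigned num_varying_inputs;
};

/* On gen4-5 each URB register holds the interpolation setup for two
 * components of a vec4, so the read length is twice the number of FS input
 * varyings.  On gen6+ the value is irrelevant; this sketch returns 0 there
 * (an assumption, not hardware behavior). */
unsigned
urb_read_length_sketch(const struct wm_prog_data_sketch *prog_data, int gen)
{
   assert(gen >= 4);
   if (gen >= 6)
      return 0;
   return prog_data->num_varying_inputs * 2;
}
```

With three varying inputs, this yields a read length of 6 on gen5.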

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/fs: Expose "urb_setup" as part of brw_wm_prog_data.
Paul Berry [Tue, 3 Sep 2013 00:24:19 +0000 (17:24 -0700)]
i965/fs: Expose "urb_setup" as part of brw_wm_prog_data.

At the moment, for Gen6+, the FS assumes that all varying inputs are
delivered to it in the order in which they appear in the
gl_program::InputsRead bitfield, and the SF/SBE setup code ensures
that they are delivered in this order.

When we add support for more than 64 varying components, this will no
longer always be possible, because the Gen6+ SF/SBE stage is only
capable of performing arbitrary reorderings of 16 varying slots.

To allow extra flexibility in the ordering of FS varyings, this patch
causes the FS to advertise exactly what ordering it expects.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoilo: make ilo_bind_sampler_states return void
Chia-I Wu [Fri, 13 Sep 2013 03:34:19 +0000 (11:34 +0800)]
ilo: make ilo_bind_sampler_states return void

So that it can be hooked up to pipe_context::bind_sampler_states, which
currently lives on another branch.

11 years agoglsl/tests: Update .gitignore for new unit test.
Kenneth Graunke [Mon, 16 Sep 2013 15:25:44 +0000 (08:25 -0700)]
glsl/tests: Update .gitignore for new unit test.

I rarely run 'git status', so I failed to notice this was missing.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoglsl/tests: Add a test for properties of sampler types.
Kenneth Graunke [Thu, 12 Sep 2013 06:57:26 +0000 (23:57 -0700)]
glsl/tests: Add a test for properties of sampler types.

For each sampler type, this tests that:
- The base type is GLSL_TYPE_SAMPLER.
- The dimensionality is set correctly.
- The returned data type is correct.
- The sampler_array and sampler_shadow flags are set correctly.
- sampler_coordinate_components() returns the correct value.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
11 years agost/mesa: don't dereference stObj->pt if NULL
Dave Airlie [Tue, 10 Sep 2013 04:46:23 +0000 (14:46 +1000)]
st/mesa: don't dereference stObj->pt if NULL

It seems a user app can get us into this state.  I trigger the failure by
running fbo-maxsize inside virgl: it fails to create the backing
storage for the texture object, but then segfaults here when it
should fail the completeness test.

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agonouveau: fix regression since float comparison instructions (v2)
Dave Airlie [Tue, 10 Sep 2013 02:02:30 +0000 (12:02 +1000)]
nouveau: fix regression since float comparison instructions (v2)

Fix the return type and allow src and dst types for comparison
to be separate.  This at least fixes the two test cases I've written.

v2: drop the u32->s32 change

Acked-by: Christoph Bumiller <christoph.bumiller@speed.at>
Signed-off-by: Dave Airlie <airlied@redhat.com>
11 years agovdpau/decode: Check max width and max height.
Rico Schüller [Sat, 14 Sep 2013 18:27:07 +0000 (20:27 +0200)]
vdpau/decode: Check max width and max height.

Reviewed-by: Christian König <christian.koenig@amd.com>
11 years agofreedreno: PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE
Rob Clark [Wed, 11 Sep 2013 14:08:08 +0000 (10:08 -0400)]
freedreno: PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE

When the old contents do not need to be preserved, it is faster to
create a new backing bo rather than stall.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
11 years agofreedreno/a3xx: fix VFD_INDEX_MAX overflow
Rob Clark [Wed, 11 Sep 2013 14:06:29 +0000 (10:06 -0400)]
freedreno/a3xx: fix VFD_INDEX_MAX overflow

max_index may be 0xffffffff.  The hardware does not need 1 + max_index
(although it does not hurt unless 1 + max_index wraps around to zero).
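
The wraparound can be illustrated in isolation.  This is only a sketch of
the 32-bit arithmetic; the function names are hypothetical and do not
correspond to the driver's register-emit code:

```c
#include <assert.h>
#include <stdint.h>

/* Adding one to the index limit: wraps to 0 when max_index is 0xffffffff,
 * because unsigned 32-bit arithmetic is modulo 2^32. */
uint32_t
index_max_plus_one_sketch(uint32_t max_index)
{
   return max_index + 1u;
}

/* Passing max_index through unchanged cannot overflow. */
uint32_t
index_max_direct_sketch(uint32_t max_index)
{
   return max_index;
}
```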

Signed-off-by: Rob Clark <robclark@freedesktop.org>
11 years agofreedreno: add debug option to disable GMEM bypass
Rob Clark [Tue, 10 Sep 2013 15:35:58 +0000 (11:35 -0400)]
freedreno: add debug option to disable GMEM bypass

Useful for debugging.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
11 years agofreedreno/a3xx: handle front_ccw
Rob Clark [Mon, 9 Sep 2013 15:31:20 +0000 (11:31 -0400)]
freedreno/a3xx: handle front_ccw

Used by supertuxkart.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
11 years agofreedreno/a3xx: stencil fixes
Rob Clark [Sun, 8 Sep 2013 21:00:40 +0000 (17:00 -0400)]
freedreno/a3xx: stencil fixes

For mem->gmem we don't sample depth/stencil as its native type.  So we
need to set up the swizzle state for the sampler based on the format used
for sampling.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
11 years agofreedreno/a3xx: alpha-test
Rob Clark [Sun, 8 Sep 2013 17:49:54 +0000 (13:49 -0400)]
freedreno/a3xx: alpha-test

Needed by some games, like etuxracer and supertuxkart which use alpha
test rather than blending, to handle texture transparency.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
11 years agofreedreno/a3xx/compiler: implement SUB
Rob Clark [Sat, 7 Sep 2013 23:57:04 +0000 (19:57 -0400)]
freedreno/a3xx/compiler: implement SUB

Signed-off-by: Rob Clark <robclark@freedesktop.org>
11 years agofreedreno/a3xx: use INDIRECT state load for shaders
Rob Clark [Fri, 6 Sep 2013 22:21:25 +0000 (18:21 -0400)]
freedreno/a3xx: use INDIRECT state load for shaders

With a debug option to force DIRECT (mainly to make it easier for
capturing cmdstream dumps).  Using INDIRECT for large shaders at least
makes a noticeable reduction in CPU load, which helps for CPU limited
games.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
11 years agofreedreno: avoid stalling at ringbuffer wraparound
Rob Clark [Fri, 6 Sep 2013 17:20:46 +0000 (13:20 -0400)]
freedreno: avoid stalling at ringbuffer wraparound

Because of how the tiling works, we can't really flush at arbitrary
points very easily.  So wraparound is handled by resetting to top of
ringbuffer.  Previously this would stall until current rendering is
complete.  Instead cycle through multiple ringbuffers to avoid a stall.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
11 years agofreedreno: emit markers to scratch registers
Rob Clark [Fri, 6 Sep 2013 16:47:18 +0000 (12:47 -0400)]
freedreno: emit markers to scratch registers

Emit markers by writing to scratch registers in order to "triangulate"
gpu lockup position from post-mortem register dump.  By comparing
register values in post-mortem dump to command-stream, it is possible to
narrow down which DRAW_INDX caused the lockup.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
11 years agofreedreno: split out WFI helper
Rob Clark [Fri, 6 Sep 2013 14:23:14 +0000 (10:23 -0400)]
freedreno: split out WFI helper

Mostly just to give an easy debug/instrumentation point.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
11 years agofreedreno: fd_draw helper
Rob Clark [Mon, 2 Sep 2013 11:32:22 +0000 (07:32 -0400)]
freedreno: fd_draw helper

Have a single helper that all draws come through, mainly for a
convenient debug and instrumentation point.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
11 years agofreedreno/a3xx: fix gpu lockup in some piglit tests
Rob Clark [Wed, 4 Sep 2013 02:00:47 +0000 (22:00 -0400)]
freedreno/a3xx: fix gpu lockup in some piglit tests

The varying-out config comes from the inputs of the frag shader (so that
we aren't exporting unneeded varyings).  The varyings-count should come
from the frag shader as well, to avoid a discrepancy in configuration
and resulting gpu lockup.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
11 years agofreedreno/a3xx/compiler: add LIT
Rob Clark [Sun, 1 Sep 2013 15:35:56 +0000 (11:35 -0400)]
freedreno/a3xx/compiler: add LIT

Needed by glxgears and etuxracer ;-)

Signed-off-by: Rob Clark <robclark@freedesktop.org>
11 years agofreedreno: multi-slice resources (cubemap, mipmap, etc)
Rob Clark [Sat, 31 Aug 2013 13:14:27 +0000 (09:14 -0400)]
freedreno: multi-slice resources (cubemap, mipmap, etc)

Signed-off-by: Rob Clark <robclark@freedesktop.org>
11 years agoglsl/builtins: Fix {texture1D,texture2D,shadow1D}ArrayLod availability.
Paul Berry [Thu, 12 Sep 2013 16:11:37 +0000 (09:11 -0700)]
glsl/builtins: Fix {texture1D,texture2D,shadow1D}ArrayLod availability.

These functions are defined in EXT_texture_array, which makes no
mention of what shader types they should be allowed in.  At the time
EXT_texture_array was introduced, functions ending in "Lod" were
available only in vertex shaders, however this restriction was lifted
in later spec versions and extensions.

We already have the function lod_exists_in_stage() for figuring out
whether functions ending in "Lod" should be available, so just re-use
that.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965: Use brw_stage_state for WM data as well.
Kenneth Graunke [Mon, 2 Sep 2013 00:31:54 +0000 (17:31 -0700)]
i965: Use brw_stage_state for WM data as well.

This gets the VS, GS, and PS all using the same data structure.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years agoi965: Increase the size of brw_stage_state::surf_offset.
Kenneth Graunke [Mon, 2 Sep 2013 00:18:22 +0000 (17:18 -0700)]
i965: Increase the size of brw_stage_state::surf_offset.

Since BRW_MAX_WM_SURFACES is greater than BRW_MAX_VEC4_SURFACES, the
existing array isn't large enough to be used by the WM.  Increasing it
will make it possible to share them.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years agoi965: Add comments to the new brw_stage_state structure's fields.
Kenneth Graunke [Mon, 2 Sep 2013 00:14:25 +0000 (17:14 -0700)]
i965: Add comments to the new brw_stage_state structure's fields.

These are largely based on the similar fields in brw->wm.

v2: Add a better comment than "Scratch buffer".

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years agomesa: Rename MESA_shader_integer_mix to EXT_shader_integer_mix
Ian Romanick [Thu, 12 Sep 2013 16:40:00 +0000 (11:40 -0500)]
mesa: Rename MESA_shader_integer_mix to EXT_shader_integer_mix

Everyone at the Khronos meeting was as surprised that GLSL didn't
already support this as we were.  Several vendors said they'd ship it,
but there didn't seem to be enough interest to put in the effort to make
it ARB or KHR.

v2: Fix a couple typos and rename the spec file to
EXT_shader_integer_mix.spec.  Suggested by Roland.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
11 years agoradeonsi: fix and enable transform feedback for CIK
Marek Olšák [Fri, 6 Sep 2013 19:59:29 +0000 (21:59 +0200)]
radeonsi: fix and enable transform feedback for CIK

The CP_STRMOUT_CNTL register was moved again.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
11 years agoradeonsi: fix gl_InstanceID with non-zero start_instance
Marek Olšák [Thu, 5 Sep 2013 13:39:57 +0000 (15:39 +0200)]
radeonsi: fix gl_InstanceID with non-zero start_instance

start_instance doesn't affect gl_InstanceID.

There's no piglit test, but it's kinda obvious the code was wrong.

Reviewed-by: Christian König <christian.koenig@amd.com>
11 years agogallium: comment that INSTANCEID doesn't include start_instance
Marek Olšák [Thu, 5 Sep 2013 13:38:42 +0000 (15:38 +0200)]
gallium: comment that INSTANCEID doesn't include start_instance

Reviewed-by: Christian König <christian.koenig@amd.com>
11 years agoradeonsi: enable streamout AKA transform feedback for SI
Marek Olšák [Sun, 18 Aug 2013 01:05:19 +0000 (03:05 +0200)]
radeonsi: enable streamout AKA transform feedback for SI

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
11 years agoradeonsi: implement streamout shader support
Marek Olšák [Sun, 1 Sep 2013 21:59:06 +0000 (23:59 +0200)]
radeonsi: implement streamout shader support

The shader is responsible for writing to streamout buffers using
the TBUFFER_STORE_FORMAT_* instructions.

The locations of some input SGPRs and VGPRs are assigned dynamically, because
the input SGPRs controlling streamout are not declared if they are not needed,
decreasing the indices of all following inputs.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
11 years agoradeonsi: implement glDrawTransformFeedback functionality
Marek Olšák [Mon, 26 Aug 2013 16:17:09 +0000 (18:17 +0200)]
radeonsi: implement glDrawTransformFeedback functionality

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
11 years agoradeonsi: fix streamout queries
Marek Olšák [Wed, 21 Aug 2013 12:27:17 +0000 (14:27 +0200)]
radeonsi: fix streamout queries

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
11 years agoradeonsi: implement streamout flush properly
Marek Olšák [Mon, 2 Sep 2013 10:57:46 +0000 (12:57 +0200)]
radeonsi: implement streamout flush properly

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
11 years agoradeonsi: bind streamout buffers to VGT and the vertex shader
Marek Olšák [Sun, 18 Aug 2013 00:34:23 +0000 (02:34 +0200)]
radeonsi: bind streamout buffers to VGT and the vertex shader

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
11 years agoradeonsi: handle rasterizer_discard and set GS_OUT_PRIM_TYPE
Marek Olšák [Sun, 18 Aug 2013 01:05:34 +0000 (03:05 +0200)]
radeonsi: handle rasterizer_discard and set GS_OUT_PRIM_TYPE

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
11 years agoradeonsi: initialize the first CS like any other
Marek Olšák [Fri, 30 Aug 2013 22:13:43 +0000 (00:13 +0200)]
radeonsi: initialize the first CS like any other

So that the "init" state is always emitted first and not later in draw_vbo.

This fixes streamout where the "init" state, which disables streamout,
was emitted in draw_vbo after streamout was enabled.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
11 years agoradeonsi: integrate shared streamout state
Marek Olšák [Tue, 13 Aug 2013 23:52:38 +0000 (01:52 +0200)]
radeonsi: integrate shared streamout state

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
11 years agoradeon: don't emit streamout state if there are no streamout buffers
Marek Olšák [Sun, 1 Sep 2013 21:00:28 +0000 (23:00 +0200)]
radeon: don't emit streamout state if there are no streamout buffers

This could happen if set_stream_output_targets is called twice
in a row without a draw call in between.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
11 years agoradeon: don't emit VGT_STRMOUT_BUFFER_BASE on SI
Marek Olšák [Sat, 31 Aug 2013 00:32:22 +0000 (02:32 +0200)]
radeon: don't emit VGT_STRMOUT_BUFFER_BASE on SI

The register doesn't exist on SI.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
11 years agomesa: Disallow relinking if a program is used by an active XFB object.
Kenneth Graunke [Fri, 6 Sep 2013 22:41:19 +0000 (15:41 -0700)]
mesa: Disallow relinking if a program is used by an active XFB object.

Paused transform feedback objects may refer to a program other than the
current program.  If any active objects refer to a program, LinkProgram
must reject the request to relink.

The code to detect this is ugly since _mesa_HashWalk is awkward to use,
but unfortunately we can't use hash_table_foreach since there's no way
to get at the underlying struct hash_table (and even then, we'd need to
handle locking somehow).

Fixes the last subcase of Piglit's new ARB_transform_feedback2
api-errors test.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
11 years agomesa: Reject ResumeTransformFeedback if the wrong program is bound.
Kenneth Graunke [Fri, 6 Sep 2013 21:51:26 +0000 (14:51 -0700)]
mesa: Reject ResumeTransformFeedback if the wrong program is bound.

This is actually a pretty important error condition: otherwise, you
could set up transform feedback with one program, and resume it with
a program that generates a completely different set of outputs.

Fixes a subcase of Piglit's new ARB_transform_feedback2 api-errors test.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
11 years agomesa: Track the vertex program active at BeginTransformFeedback() time.
Kenneth Graunke [Fri, 6 Sep 2013 21:47:19 +0000 (14:47 -0700)]
mesa: Track the vertex program active at BeginTransformFeedback() time.

The next few patches will use this for API error checking.

All of the drivers appear to CALLOC_STRUCT transform feedback objects,
so this should be properly NULL initialized on creation.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
11 years agomesa: Disallow TransformFeedbackVaryings when active.
Kenneth Graunke [Fri, 6 Sep 2013 19:38:12 +0000 (12:38 -0700)]
mesa: Disallow TransformFeedbackVaryings when active.

Fixes a subcase of Piglit's new ARB_transform_feedback2 api-errors test.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
11 years agoradeon/uvd: move more logic into the common files
Christian König [Mon, 9 Sep 2013 08:49:55 +0000 (10:49 +0200)]
radeon/uvd: move more logic into the common files

Move the code back into the common UVD files since we now
have base structures for R600 and radeonsi.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
11 years agoradeon/uvd: use more sane defaults for bitstream buffer size
Christian König [Sat, 7 Sep 2013 17:40:34 +0000 (11:40 -0600)]
radeon/uvd: use more sane defaults for bitstream buffer size

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
11 years agoos: First check for __GLIBC__ and then for PIPE_OS_BSD
Andreas Boll [Wed, 11 Sep 2013 12:27:08 +0000 (14:27 +0200)]
os: First check for __GLIBC__ and then for PIPE_OS_BSD

Fixes FTBFS on kfreebsd-*

Debian GNU/kFreeBSD doesn't provide getprogname() since it uses stdlib.h
from glibc. Instead it provides program_invocation_short_name from glibc.

You can find the same order in src/mesa/drivers/dri/common/xmlconfig.c

Cc: "9.2" <mesa-stable@lists.freedesktop.org>
Tested-by: Julien Cristau <jcristau@debian.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years agollvmpipe: Remove the special path for TGSI_OPCODE_EXP.
José Fonseca [Wed, 11 Sep 2013 11:04:29 +0000 (12:04 +0100)]
llvmpipe: Remove the special path for TGSI_OPCODE_EXP.

It was wrong for EXP.y, as we clamped the source before computing the
fractional part, and this opcode should be rarely used, so it's not
worth the hassle.
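
A minimal sketch of why the ordering matters.  TGSI defines EXP.y as
src.x - floor(src.x); the clamp bounds and helper names below are
hypothetical, since the message does not give the range the removed path
used:

```c
#include <assert.h>

/* Hypothetical clamp range, chosen only for illustration. */
#define EXP_CLAMP_MIN (-126.0f)
#define EXP_CLAMP_MAX (129.0f)

/* Small floor() stand-in so the sketch needs no libm; only valid for
 * values that fit in a long. */
float
floor_sketch(float x)
{
   float t = (float)(long)x;
   return (t > x) ? t - 1.0f : t;
}

float
clamp_sketch(float x)
{
   if (x < EXP_CLAMP_MIN)
      return EXP_CLAMP_MIN;
   if (x > EXP_CLAMP_MAX)
      return EXP_CLAMP_MAX;
   return x;
}

/* EXP.y computed from the unclamped source. */
float
exp_y_sketch(float src)
{
   return src - floor_sketch(src);
}

/* The problematic order: clamping first discards the fractional part of
 * any source outside the clamp range. */
float
exp_y_clamped_first_sketch(float src)
{
   float c = clamp_sketch(src);
   return c - floor_sketch(c);
}
```

For a source of 200.25 the first function returns 0.25, while the
clamp-first variant returns 0.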

11 years agotrace: Several enhancements to dump_state.py
José Fonseca [Wed, 4 Sep 2013 17:10:35 +0000 (18:10 +0100)]
trace: Several enhancements to dump_state.py

- Handle more calls
- Handle more state
- Try to normalize the output a bit, to eliminate spurious differences

11 years agotrace: Support bigger TGSI shaders.
José Fonseca [Wed, 4 Sep 2013 13:54:31 +0000 (14:54 +0100)]
trace: Support bigger TGSI shaders.

Trivial.

11 years agoglsl: Use sampler_coordinate_components instead of passing it by hand.
Kenneth Graunke [Wed, 11 Sep 2013 18:20:36 +0000 (11:20 -0700)]
glsl: Use sampler_coordinate_components instead of passing it by hand.

We used to pass the number of components actually used for the
coordinate (rather than padding, shadow comparators, and projectors) by
hand, specifying it on every _texture() call.

The new helper function can just compute this, eliminating a lot of
potential mistakes.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoglsl: Add a new glsl_type::sampler_coordinate_components() function.
Kenneth Graunke [Wed, 11 Sep 2013 18:14:14 +0000 (11:14 -0700)]
glsl: Add a new glsl_type::sampler_coordinate_components() function.

This computes the number of components necessary to address a sampler
based on its dimensionality.  It will be useful for texturing built-ins.
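
Roughly, such a helper could look like the sketch below.  The enum, the
function name, and the rule that array samplers need one extra component
for the layer index are assumptions made for illustration; this is not the
glsl_type implementation itself:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of sampler dimensionalities. */
enum sampler_dim_sketch {
   DIM_1D, DIM_2D, DIM_3D, DIM_CUBE, DIM_RECT, DIM_BUF
};

/* Number of coordinate components needed to address a sampler.  Shadow
 * comparators and projectors are deliberately not counted. */
unsigned
coordinate_components_sketch(enum sampler_dim_sketch dim, bool is_array)
{
   unsigned size = 0;

   switch (dim) {
   case DIM_1D:
   case DIM_BUF:
      size = 1;
      break;
   case DIM_2D:
   case DIM_RECT:
      size = 2;
      break;
   case DIM_3D:
   case DIM_CUBE:
      size = 3;
      break;
   }
   assert(size != 0);

   /* Assumption: array samplers take one more component (the layer). */
   if (is_array)
      size++;

   return size;
}
```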

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoMove nv30, nv50 and nvc0 to nouveau.
Johannes Obermayr [Tue, 20 Aug 2013 18:14:00 +0000 (20:14 +0200)]
Move nv30, nv50 and nvc0 to nouveau.

It is planned to ship openSUSE 13.1 with -shared libs.
nouveau.la, nv30.la, nv50.la and nvc0.la are currently LIBADDs in all nouveau
related targets.
This change makes it possible to easily build one shared libnouveau.so which is
then LIBADDed.
Also dlopen will be faster for one library instead of three and build time on
-jX will be reduced.

Whitespace fixes were requested by 'git am'.

Signed-off-by: Johannes Obermayr <johannesobermayr@gmx.de>
Acked-by: Christoph Bumiller <christoph.bumiller@speed.at>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
11 years agoi965/gs: implement EndPrimitive() functionality in the visitor.
Paul Berry [Sun, 21 Apr 2013 15:51:33 +0000 (08:51 -0700)]
i965/gs: implement EndPrimitive() functionality in the visitor.

According to GLSL, the shader may call EndPrimitive() at any point
during its execution, causing the line or triangle strip currently
being output to be terminated and a new strip to be begun.

This is implemented in gen7 hardware by using one control data bit per
vertex, to indicate whether EndPrimitive() was called after that
vertex was emitted.

In order to make this work without sacrificing too much efficiency, we
accumulate 32 control data bits at a time in a GRF.  When we have
accumulated 32 bits (or when the shader terminates), we output them to
the appropriate DWORD in the control data header and reset the
accumulator to 0.

We have to take special care to make sure that EndPrimitive() calls
that occur prior to the first vertex have no effect.

Since geometry shaders that output a large number of vertices are
likely to be rare, an optimization kicks in if max_vertices <= 32.  In
this case, we know that we can wait until the end of shader execution
before any control data bits need to be output.

I've tried to write the code in such a way that in the future, we can
easily adapt it to output stream ID bits (which are two bits/vertex
instead of one).
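
The accumulation scheme described above can be modelled on the CPU as
follows.  This is only a sketch: the struct, the function names, and the
fixed header size are hypothetical, and the real visitor emits GPU
instructions rather than running this logic itself:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_CONTROL_DWORDS 8 /* hypothetical limit: 256 vertices */

struct gs_sketch {
   uint32_t accumulator;   /* control bits for the current batch of 32 */
   unsigned vertex_count;  /* vertices emitted so far */
   uint32_t header[MAX_CONTROL_DWORDS]; /* models the control data header */
};

void
gs_init_sketch(struct gs_sketch *gs)
{
   memset(gs, 0, sizeof(*gs));
}

/* Write the accumulated bits to the DWORD that covers the most recently
 * emitted vertex, then reset the accumulator to 0. */
void
gs_flush_sketch(struct gs_sketch *gs)
{
   if (gs->vertex_count == 0)
      return;
   gs->header[(gs->vertex_count - 1) / 32] = gs->accumulator;
   gs->accumulator = 0;
}

void
gs_emit_vertex_sketch(struct gs_sketch *gs)
{
   /* A full batch of 32 bits is flushed just before the next vertex, so an
    * EndPrimitive() after the 32nd vertex can still set its bit. */
   if (gs->vertex_count > 0 && gs->vertex_count % 32 == 0)
      gs_flush_sketch(gs);
   assert(gs->vertex_count < 32 * MAX_CONTROL_DWORDS);
   gs->vertex_count++;
}

void
gs_end_primitive_sketch(struct gs_sketch *gs)
{
   /* Calls prior to the first vertex have no effect. */
   if (gs->vertex_count == 0)
      return;
   gs->accumulator |= 1u << ((gs->vertex_count - 1) % 32);
}

/* At shader termination, output whatever bits remain. */
void
gs_terminate_sketch(struct gs_sketch *gs)
{
   gs_flush_sketch(gs);
}
```

When max_vertices <= 32, only the flush in gs_terminate_sketch() can ever
run, which corresponds to the optimization mentioned above.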

Fixes piglit tests "spec/glsl-1.50/glsl-1.50-geometry-end-primitive *".

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/vec4: Add the ability to emit opcodes with just a dst register.
Paul Berry [Sun, 21 Apr 2013 15:51:33 +0000 (08:51 -0700)]
i965/vec4: Add the ability to emit opcodes with just a dst register.

This is needed for GS_OPCODE_PREPARE_CHANNEL_MASKS.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/gs: Add opcodes needed for EndPrimitive().
Paul Berry [Sun, 21 Apr 2013 15:51:33 +0000 (08:51 -0700)]
i965/gs: Add opcodes needed for EndPrimitive().

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/gen7: Add the ability to send URB_WRITE_OWORD messages.
Paul Berry [Mon, 12 Aug 2013 03:29:34 +0000 (20:29 -0700)]
i965/gen7: Add the ability to send URB_WRITE_OWORD messages.

Previously, brw_urb_WRITE() would always generate a URB_WRITE_HWORD
message, because we always wanted to write data to the URB in pairs of
varying slots or larger (an HWORD is 32 bytes, which is 2 varying slots).

In order to support geometry shader EndPrimitive functionality, we'll
need the ability to write to just a single OWORD (16 byte) slot, since
we'll only be outputting 32 of the control data bits at a time.  So
this patch adds a flag that will cause brw_urb_WRITE to generate a
URB_WRITE_OWORD message.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/gen7: Allow URB_WRITE channel masks to be used.
Paul Berry [Sun, 11 Aug 2013 04:57:59 +0000 (21:57 -0700)]
i965/gen7: Allow URB_WRITE channel masks to be used.

Previously, brw_urb_WRITE() would unconditionally override the channel
masks in the URB_WRITE message to 0xff (indicating that all channels
should be written to the URB).

In order to support geometry shader EndPrimitive functionality, we'll
need the ability to set the channel masks programmatically, so that we
can output just 32 of the control data bits at a time.  So this patch
adds a flag that will prevent brw_urb_WRITE() from overriding them.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/gs: Set control data header size/format appropriately for EndPrimitive().
Paul Berry [Mon, 19 Aug 2013 04:18:19 +0000 (21:18 -0700)]
i965/gs: Set control data header size/format appropriately for EndPrimitive().

The gen7 geometry shader uses a "control data header" at the beginning
of the output URB entry to store either

(a) flag bits (1 bit/vertex) indicating whether EndPrimitive() was
    called after each vertex, or

(b) stream ID bits (2 bits/vertex) indicating which stream each vertex
    should be sent to (when multiple transform feedback streams are in
    use).

Fortunately, OpenGL only requires separate streams to be supported
when the output type is points, and EndPrimitive() only has an effect
when the output type is line_strip or triangle_strip, so it's not a
problem that these two uses of the control data header are mutually
exclusive.

This patch modifies do_vec4_gs_prog() to determine the correct
hardware settings for configuring the control data header, and
modifies upload_gs_state() to propagate these settings to the
hardware.

In addition, it modifies do_vec4_gs_prog() to ensure that the output
URB entry is large enough to contain both the output vertices *and*
the control data header.

Finally, it modifies vec4_gs_visitor so that it accounts for the size
of the control data header when computing the offset within the URB
where output vertex data should be stored.
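
As a rough sketch of the sizing arithmetic: the bits-per-vertex figures
come from (a) and (b) above, while the names and the rounding to whole
DWORDs are assumptions made for illustration (the hardware's actual
allocation granularity is not stated here):

```c
#include <assert.h>

/* Hypothetical names for the two mutually exclusive header formats. */
enum control_data_format_sketch {
   CONTROL_DATA_CUT_BITS,   /* 1 bit/vertex: EndPrimitive() flags */
   CONTROL_DATA_STREAM_IDS  /* 2 bits/vertex: stream IDs */
};

unsigned
control_data_bits_per_vertex_sketch(enum control_data_format_sketch fmt)
{
   return fmt == CONTROL_DATA_STREAM_IDS ? 2 : 1;
}

/* Header size in 32-bit DWORDs, rounded up.  Assumption: DWORD
 * granularity is used only to keep the example simple. */
unsigned
control_data_header_dwords_sketch(unsigned max_vertices,
                                  enum control_data_format_sketch fmt)
{
   unsigned bits = max_vertices * control_data_bits_per_vertex_sketch(fmt);
   return (bits + 31) / 32;
}
```

Output vertex data would then start after the header, which is why both
the URB entry size and the vertex offsets must account for it.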

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
v2: Fixed incorrect handling of IVB/HSW differences.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoglsl: During linking, record whether a GS uses EndPrimitive().
Paul Berry [Mon, 19 Aug 2013 03:59:37 +0000 (20:59 -0700)]
glsl: During linking, record whether a GS uses EndPrimitive().

This information will be useful in the i965 back end, since we can
save some compilation effort if we know from the outset that the
shader never calls EndPrimitive().

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/gs: Add a state atom to set up geometry shader state.
Paul Berry [Wed, 27 Mar 2013 20:21:36 +0000 (13:21 -0700)]
i965/gs: Add a state atom to set up geometry shader state.

v2: Do not attempt to share the code that uploads
3DSTATE_BINDING_TABLE_POINTERS_GS, 3DSTATE_SAMPLER_STATE_POINTERS_GS,
or 3DSTATE_GS with VS.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
v3: Add _NEW_TRANSFORM to gen7_gs_state.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agoi965/gen7: Extract a function for setting up a shader stage's constants.
Paul Berry [Mon, 9 Sep 2013 14:28:17 +0000 (07:28 -0700)]
i965/gen7: Extract a function for setting up a shader stage's constants.

This will allow us to reuse some code when setting up the geometry
shader stage.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years agowayland-egl.pc requires wayland-client.pc.
Torsten Duwe [Tue, 10 Sep 2013 21:36:48 +0000 (23:36 +0200)]
wayland-egl.pc requires wayland-client.pc.

Mesa provides the wayland-egl libs and the pkgconfig file, but the headers
originate from the wayland package. Ensure everything matches, by requiring
application builds to look at the wayland headers as well.

Signed-off-by: Torsten Duwe <duwe@suse.de>
Signed-off-by: Johannes Obermayr <johannesobermayr@gmx.de>
11 years agost/gbm: Add $(WAYLAND_CFLAGS) for HAVE_EGL_PLATFORM_WAYLAND.
Johannes Obermayr [Tue, 10 Sep 2013 21:36:47 +0000 (23:36 +0200)]
st/gbm: Add $(WAYLAND_CFLAGS) for HAVE_EGL_PLATFORM_WAYLAND.

11 years agost/dri: do not create a new context for msaa copy
Maarten Lankhorst [Mon, 9 Sep 2013 11:02:08 +0000 (13:02 +0200)]
st/dri: do not create a new context for msaa copy

Commit b77316ad7594f
    st/dri: always copy new DRI front and back buffers to corresponding MSAA buffers

introduced creating a pipe_context for every call to validate, which is not required
because the callers have a context anyway.

The only exception is egl_g3d_create_pbuffer_from_client_buffer.  Can someone
test whether it still works with NULL passed as context for validate?  From
examining the code I believe it does, but I didn't thoroughly test it.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Cc: 9.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
11 years agoi965: Add an assertion that writemask != NULL for non-ARFs.
Kenneth Graunke [Mon, 9 Sep 2013 22:40:22 +0000 (15:40 -0700)]
i965: Add an assertion that writemask != NULL for non-ARFs.

We've observed GPU hangs on Ivybridge from the following instruction:

mov(8) g115<1>.F 0D { align16 WE_normal NoDDChk 1Q };

There should be no reason to ever set the writemask on a destination
register to zero, except for perhaps the ARF NULL register.

This patch adds an assertion to enforce this for non-ARF registers.
Excluding ARFs is conservative yet should still catch the majority
of mistakes.
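
The check amounts to something like the following sketch.  The types and
names are hypothetical, not the driver's register structures:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register files; only the ARF/non-ARF split matters here. */
enum reg_file_sketch { FILE_ARF, FILE_GRF, FILE_MRF };

struct dst_reg_sketch {
   enum reg_file_sketch file;
   unsigned writemask; /* low 4 bits: one per channel (x, y, z, w) */
};

/* A destination is acceptable if it is an ARF register (excluded from the
 * check) or if at least one channel is enabled in its writemask. */
bool
dst_writemask_ok_sketch(const struct dst_reg_sketch *dst)
{
   return dst->file == FILE_ARF || (dst->writemask & 0xf) != 0;
}
```

An emit path could then assert(dst_writemask_ok_sketch(&dst)) before
encoding the instruction.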

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
11 years agoi965/vec4: Only zero out unused message components when there are any.
Kenneth Graunke [Mon, 9 Sep 2013 18:11:03 +0000 (11:11 -0700)]
i965/vec4: Only zero out unused message components when there are any.

Otherwise, coordinates with four components would result in a MOV
with a destination writemask that has no channels enabled:

mov(8) g115<1>.F 0D { align16 WE_normal NoDDChk 1Q };

At best, this is stupid: we emit code that shouldn't do anything.
Worse, it apparently causes GPU hangs (observable with Chris's
textureGather test on CubeArrays.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
Cc: mesa-stable@lists.freedesktop.org
11 years agoi965/vec4: Simplify the computation of coord_mask and zero_mask.
Kenneth Graunke [Mon, 9 Sep 2013 22:36:59 +0000 (15:36 -0700)]
i965/vec4: Simplify the computation of coord_mask and zero_mask.

We can easily compute these without loops, resulting in simpler and
shorter code.
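
One loop-free way to compute such masks, as a sketch with hypothetical
function names: the low n bits select the coordinate channels, and the
remaining bits of the 4-channel register are the ones to zero.

```c
#include <assert.h>

/* Mask with one bit set per coordinate component (n in 1..4). */
unsigned
coord_mask_sketch(unsigned num_components)
{
   assert(num_components >= 1 && num_components <= 4);
   return (1u << num_components) - 1u;
}

/* Channels of the 4-wide message register not covered by the coordinate. */
unsigned
zero_mask_sketch(unsigned num_components)
{
   return 0xfu & ~coord_mask_sketch(num_components);
}
```

With four components the zero mask is empty, which is the case where no
zeroing MOV should be emitted at all.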

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
11 years agodocs: Clean up autoconf.html.
Matt Turner [Mon, 9 Sep 2013 23:27:18 +0000 (16:27 -0700)]
docs: Clean up autoconf.html.

Remove long dead options and clarify some things.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69148
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago mesa: Properly set the fog scale (gl_Fog.scale) to +INF when fog start and end are...
Henri Verbeet [Sat, 31 Aug 2013 09:50:16 +0000 (11:50 +0200)]
mesa: Properly set the fog scale (gl_Fog.scale) to +INF when fog start and end are equal.

This was originally introduced by commit
ba47aabc9868b410cdfe3bc8b6d25a44a598cba2, but unfortunately the commit message
doesn't go into much detail about why +INF would be a problem here.

A similar issue exists for STATE_FOG_PARAMS_OPTIMIZED, but allowing infinity
there would potentially introduce NaNs where they shouldn't exist, depending
on the values of fog end and the fog coord. Since STATE_FOG_PARAMS_OPTIMIZED
is only used for fixed function (including ARB_fragment_program with fog
option), and the calculation there probably isn't very stable to begin with
when fog start and end are close together, it seems best to just leave it
alone.

This fixes piglit glsl-fs-fogscale, and a couple of Wine D3D tests. No piglit
regressions on Cayman.
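
The following sketch illustrates the arithmetic involved; it is not the Mesa state code. GLSL defines gl_Fog.scale as 1.0 / (end - start). The linear_fog_factor() helper is a hypothetical stand-in for any formula that multiplies by the scale, and shows how an infinite scale can turn into NaN:

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// gl_Fog.scale is 1.0 / (end - start); when start == end this sketch
// returns +INF explicitly rather than relying on division by zero.
float fog_scale(float start, float end) {
    if (start == end)
        return std::numeric_limits<float>::infinity();
    return 1.0f / (end - start);
}

// Hypothetical linear fog factor written as (end - coord) * scale.
// With an infinite scale and coord == end this evaluates 0 * INF = NaN.
float linear_fog_factor(float coord, float start, float end) {
    return (end - coord) * fog_scale(start, end);
}
```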

Signed-off-by: Henri Verbeet <hverbeet@gmail.com>
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago mesa: Use correct enum conversion function.
Vinson Lee [Tue, 10 Sep 2013 01:53:50 +0000 (18:53 -0700)]
mesa: Use correct enum conversion function.

Fixes "Mixing enum types" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago mesa: Ensure gl_sync_object is fully initialized.
Vinson Lee [Tue, 10 Sep 2013 00:28:35 +0000 (17:28 -0700)]
mesa: Ensure gl_sync_object is fully initialized.

Commit 278372b47e4db8a022d57f60302eec74819e9341 added the pointer field
gl_sync_object::Label without initializing it. A free of this pointer, added
in commit 6d8dd59cf53d2f47b817d79204a52bb3a46e8c77, resulted in a crash.

This patch fixes piglit ARB_sync regressions with swrast introduced by
6d8dd59cf53d2f47b817d79204a52bb3a46e8c77.
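
The failure mode and one possible remedy can be sketched as follows. SyncObject and both functions are hypothetical stand-ins, not the Mesa structures, and the actual fix may differ:

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical, heavily reduced stand-in for a sync object with a label.
struct SyncObject {
    unsigned name;
    char *label;  // optional debug label, owned by the object
};

// Initialize every field at allocation time, including the label pointer,
// so that the destructor never sees an indeterminate value.
SyncObject *new_sync_object(unsigned name) {
    SyncObject *obj = static_cast<SyncObject *>(std::calloc(1, sizeof(SyncObject)));
    if (obj) {
        obj->name = name;
        obj->label = nullptr;
    }
    return obj;
}

// free(nullptr) is a no-op, so this is safe for objects without a label.
void delete_sync_object(SyncObject *obj) {
    if (!obj)
        return;
    std::free(obj->label);
    std::free(obj);
}
```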

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
11 years ago radeonsi: Add parentheses around '|' operands.
Vinson Lee [Tue, 10 Sep 2013 03:14:28 +0000 (20:14 -0700)]
radeonsi: Add parentheses around '|' operands.

Fixes GCC parentheses warning.

r600_texture.c: In function 'si_texture_create':
r600_texture.c:518:20: warning: suggest parentheses around arithmetic in operand of '|' [-Wparentheses]
      !(templ->bind & PIPE_BIND_CURSOR | PIPE_BIND_LINEAR)) {
                    ^

Fixes "Wrong operator used" defect reported by Coverity.
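
The precedence issue behind the warning can be reproduced in isolation. The flag values below are made up for illustration and are not the real PIPE_BIND_* constants; as_intended() shows the grouping the fix presumably aims for:

```cpp
#include <cassert>

// Hypothetical flag values; not the real PIPE_BIND_* constants.
constexpr unsigned BIND_CURSOR = 1u << 0;
constexpr unsigned BIND_LINEAR = 1u << 1;
constexpr unsigned BIND_OTHER  = 1u << 2;

// '&' binds tighter than '|', so `bind & BIND_CURSOR | BIND_LINEAR`
// is parsed like this and is non-zero for every input.
bool as_parsed(unsigned bind) {
    return ((bind & BIND_CURSOR) | BIND_LINEAR) != 0;
}

// Test whether either flag is set in `bind`.
bool as_intended(unsigned bind) {
    return (bind & (BIND_CURSOR | BIND_LINEAR)) != 0;
}
```

With the unparenthesized form, the negated condition can never be true, which matches the "Wrong operator used" report.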

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
11 years ago util: Fix unmatched parenthesis.
Vinson Lee [Tue, 10 Sep 2013 17:31:29 +0000 (10:31 -0700)]
util: Fix unmatched parenthesis.

Fixes MSVC build error introduced with commit
923d3467147dd301d94ed3e6b41295fb2bcd6f47.

src\gallium\auxiliary\util\u_cpu_detect.c(286) : fatal error C1012: unmatched parenthesis : missing '('

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
11 years ago util: don't use _fxsave() with MSVC 2010 or older
Brian Paul [Tue, 10 Sep 2013 15:20:34 +0000 (09:20 -0600)]
util: don't use _fxsave() with MSVC 2010 or older

Also update the _MSC_VER comments in p_config.h.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
11 years ago glsl: Add missing va_end in builtin_builder::add_function.
Vinson Lee [Tue, 10 Sep 2013 03:25:55 +0000 (20:25 -0700)]
glsl: Add missing va_end in builtin_builder::add_function.

Fixes "Missing varargs init or cleanup" defect reported by Coverity.
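
For reference, every va_start must be paired with a va_end before the function returns. A self-contained example of the pattern, using a hypothetical sum_ints() helper unrelated to builtin_builder:

```cpp
#include <cassert>
#include <cstdarg>

// Sum `count` int arguments passed after `count`.
int sum_ints(int count, ...) {
    va_list ap;
    va_start(ap, count);
    int total = 0;
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);  // cleanup that must happen on every path out of the function
    return total;
}
```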

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago glsl: Initialize builtin_builder member variables.
Vinson Lee [Tue, 10 Sep 2013 03:45:28 +0000 (20:45 -0700)]
glsl: Initialize builtin_builder member variables.

Fixes "Uninitialized pointer field" defect reported by Coverity.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago glsl: fix variadic macro for MSVC
Brian Paul [Mon, 9 Sep 2013 23:02:52 +0000 (17:02 -0600)]
glsl: fix variadic macro for MSVC

MSVC doesn't accept the GNU-style named variadic parameter syntax (rest...).
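
The portable spelling uses an unnamed `...` parameter together with __VA_ARGS__. The macro and helper below are hypothetical and only demonstrate the two forms:

```cpp
#include <cassert>

// Hypothetical helper that the macro forwards to.
inline int add3(int a, int b, int c) { return a + b + c; }

// GNU extension with a named variadic parameter, not accepted by MSVC:
//   #define CALL_WITH_PREFIX(prefix, rest...) add3(prefix, rest)
//
// Standard form (C99 / C++11), accepted by both compilers:
#define CALL_WITH_PREFIX(prefix, ...) add3(prefix, __VA_ARGS__)
```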

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago glsl: remove struct keyword from ir_variable declarations
Brian Paul [Mon, 9 Sep 2013 23:02:19 +0000 (17:02 -0600)]
glsl: remove struct keyword from ir_variable declarations

To silence MSVC warnings.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
11 years ago Revert "i965/vec4: Only zero out unused message components when there are any."
Kenneth Graunke [Mon, 9 Sep 2013 22:32:26 +0000 (15:32 -0700)]
Revert "i965/vec4: Only zero out unused message components when there are any."

This reverts commit 6c3db2167c64ecf2366862f15f8e2d4a91f1028c, which I
accidentally pushed along with other code.  A better version of the fix
will be committed later.

11 years ago i965: Allow immediates to be folded into logical and shift instructions.
Matt Turner [Mon, 5 Aug 2013 22:17:04 +0000 (15:17 -0700)]
i965: Allow immediates to be folded into logical and shift instructions.

These instructions will be used with immediate arguments in the upcoming
ldexp lowering pass and frexp implementation.

v2: Add vec4 support as well.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years ago i965: Enable MESA_shader_integer_mix.
Matt Turner [Sat, 7 Sep 2013 00:53:39 +0000 (17:53 -0700)]
i965: Enable MESA_shader_integer_mix.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago glsl: Implement MESA_shader_integer_mix extension.
Matt Turner [Thu, 29 Aug 2013 01:01:39 +0000 (18:01 -0700)]
glsl: Implement MESA_shader_integer_mix extension.

Because why doesn't GLSL allow you to do this already?

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago glsl: Use conditional-select in mix().
Matt Turner [Fri, 6 Sep 2013 19:36:48 +0000 (12:36 -0700)]
glsl: Use conditional-select in mix().

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago i965: Add support for ir_triop_csel.
Matt Turner [Mon, 19 Aug 2013 17:44:41 +0000 (10:44 -0700)]
i965: Add support for ir_triop_csel.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago glsl: Add conditional-select IR.
Matt Turner [Mon, 19 Aug 2013 17:45:46 +0000 (10:45 -0700)]
glsl: Add conditional-select IR.

It's a ?: that operates per-component on vectors. Will be used in
upcoming lowering pass for ldexp and the implementation of frexp.

 csel(selector, a, b):
   per-component result = selector ? a : b
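
The semantics above can be modelled outside the compiler with a small template. This is only an illustration of the per-component behaviour on std::array, not the IR implementation:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Per-component select: result[i] = selector[i] ? a[i] : b[i].
template <typename T, std::size_t N>
std::array<T, N> csel(const std::array<bool, N> &selector,
                      const std::array<T, N> &a,
                      const std::array<T, N> &b) {
    std::array<T, N> result{};
    for (std::size_t i = 0; i < N; i++)
        result[i] = selector[i] ? a[i] : b[i];
    return result;
}
```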

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
11 years ago glsl: Rename ir_function_signature::builtin_info to builtin_avail.
Kenneth Graunke [Mon, 9 Sep 2013 21:53:22 +0000 (14:53 -0700)]
glsl: Rename ir_function_signature::builtin_info to builtin_avail.

builtin_info was originally going to be a structure containing a bunch
of information, but after various rewrites, it turned into a boolean
availability predicate.

builtin_avail is a better name than builtin_info, since it doesn't
store any information other than availability.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years ago build: Delete cross-compiling macros.
Kenneth Graunke [Fri, 6 Sep 2013 00:10:54 +0000 (17:10 -0700)]
build: Delete cross-compiling macros.

Now that builtin_compiler is gone, nothing uses these.

Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
11 years ago glsl: Add missing type inference for ir_binop_bfm.
Kenneth Graunke [Thu, 5 Sep 2013 23:57:29 +0000 (16:57 -0700)]
glsl: Add missing type inference for ir_binop_bfm.

Matt noticed that this was missing.  Nothing uses this currently.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years ago glsl: Delete old built-in function generation code.
Kenneth Graunke [Wed, 4 Sep 2013 04:23:18 +0000 (21:23 -0700)]
glsl: Delete old built-in function generation code.

None of this is used anymore.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
11 years ago glsl: Remove builtin_compiler from the build system.
Kenneth Graunke [Wed, 4 Sep 2013 04:22:17 +0000 (21:22 -0700)]
glsl: Remove builtin_compiler from the build system.

We don't actually use anything from builtin_function.cpp, so we don't
need to generate it anymore.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Paul Berry <stereotype441@gmail.com>
11 years ago glsl: Switch to the new built-in function module.
Kenneth Graunke [Mon, 2 Sep 2013 03:48:45 +0000 (20:48 -0700)]
glsl: Switch to the new built-in function module.

All built-ins are now handled by the new code; the old system is dead.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years ago glsl: Write a new built-in function module.
Kenneth Graunke [Fri, 30 Aug 2013 06:06:39 +0000 (23:06 -0700)]
glsl: Write a new built-in function module.

This creates a new replacement for the existing built-in function code.
The new module lives in builtin_functions.cpp (not builtin_function.cpp)
and exists in parallel with the existing system.  It isn't used yet.

The new built-in function code takes a significantly different approach:

Instead of implementing built-ins via printed IR, build time scripts,
and run time parsing, we now implement them directly in C++, using
ir_builder.  This translates to faster load times, and a much less
complex build system.

It also takes a different approach to built-in availability: each
signature now stores a boolean predicate, which makes it easy to
construct arbitrary expressions based on _mesa_glsl_parse_state's
fields.  This is much more flexible than the old system, and also
easier to use.

Built-ins are also now stored in a single gl_shader object, rather
than being spread out across a number of shaders that need to be linked.
When searching for a matching prototype, we simply consult the
availability predicate.  This also simplifies the code.
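
The availability-predicate idea can be sketched as follows. ParseState, Signature, the predicate names, and the built-in names are all hypothetical simplifications; the real code works on _mesa_glsl_parse_state and IR signatures:

```cpp
#include <cassert>
#include <cstring>
#include <vector>

// Hypothetical, reduced parse state.
struct ParseState {
    unsigned language_version;  // e.g. 130 for GLSL 1.30
    bool es_shader;
};

// Each signature carries a predicate over the parse state.
typedef bool (*AvailabilityPredicate)(const ParseState *);

struct Signature {
    const char *name;
    AvailabilityPredicate is_available;
};

static bool always_available(const ParseState *) { return true; }
static bool requires_130(const ParseState *s) { return s->language_version >= 130; }

// All signatures live in one table; lookup simply consults the predicate.
const Signature *find_builtin(const std::vector<Signature> &sigs,
                              const char *name, const ParseState *state) {
    for (const Signature &sig : sigs) {
        if (std::strcmp(sig.name, name) == 0 && sig.is_available(state))
            return &sig;
    }
    return nullptr;
}
```

Because a predicate is an ordinary function, arbitrary combinations of version and extension checks fit in without changes to the lookup.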

v2: Incorporate Matt Turner's feedback: use the new fma() function rather
    than expr().  Don't expose textureQueryLOD() in GLSL 4.00 (since it
    was renamed to textureQueryLod()).  Also correct some #undefs.
v3: Incorporate Paul Berry's feedback: rename legacy to compatibility;
    add comments to explain a few things; fix uvec availability; include
    shaderobj.h instead of repeating the _mesa_new_shader prototype.
v4: Fix lack of TEX_PROJECT on textureProjGrad[Offset] (caught by oglc).
    Add an out_var convenience function (more feedback by Matt Turner).
v5: Rework availability predicates for Lod functions.  They were broken.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Enthusiastically-acked-by: Paul Berry <stereotype441@gmail.com>
11 years ago glsl: Add optional parameters to the ir_factory constructor.
Kenneth Graunke [Wed, 4 Sep 2013 00:07:18 +0000 (17:07 -0700)]
glsl: Add optional parameters to the ir_factory constructor.

Each ir_factory needs an instruction list and memory context in order to
be useful.  Rather than creating an object and manually assigning these,
we can just use optional parameters in the constructor.

This makes it possible to create a ready-to-use factory in one line:

   ir_factory body(&sig->body, mem_ctx);
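
The constructor shape described here can be sketched with default arguments. Factory and InstructionList are hypothetical stand-ins, not the real ir_factory or exec_list declarations:

```cpp
#include <cassert>

struct InstructionList {};  // hypothetical stand-in for an instruction list

class Factory {
public:
    // Both parameters are optional, so the old two-step setup still works
    // while a ready-to-use object can be built in one line.
    explicit Factory(InstructionList *instructions = nullptr,
                     void *mem_ctx = nullptr)
        : instructions(instructions), mem_ctx(mem_ctx) {}

    InstructionList *instructions;
    void *mem_ctx;
};
```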

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years ago glsl: Add IR builder shortcuts for a bunch of random opcodes.
Kenneth Graunke [Wed, 4 Sep 2013 00:02:07 +0000 (17:02 -0700)]
glsl: Add IR builder shortcuts for a bunch of random opcodes.

Adding new convenience emitters makes it easier to generate IR involving
these opcodes.

bitfield_insert is particularly useful, since there is no expr() for
quadops.

v2: Add fma() and rename lrp() operands to x/y/a to match the GLSL
    specification (suggested by Matt Turner).  Fix whitespace issues.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years ago glsl: Expose IR builder support for arbitrary swizzling.
Kenneth Graunke [Tue, 3 Sep 2013 23:55:37 +0000 (16:55 -0700)]
glsl: Expose IR builder support for arbitrary swizzling.

IR builder already offers a lot of swizzling functions, such as
swizzle_xxxx, swizzle_z, or swizzle_for_size.

The swizzle_xxxx style is convenient if you statically know which
components you want.  swizzle_for_size is great if you want to select
the first few components.  However, if you want to select components
based on, say, a loop counter, none of those are sufficient.

IR builder actually already had support for arbitrary swizzling, but
didn't expose it.  This patch exposes that API.
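
To illustrate why a run-time component list is useful, here is a toy swizzle on plain float data. The swizzle() function is hypothetical and operates on values rather than IR; it only shows selection driven by a loop counter:

```cpp
#include <array>
#include <cassert>
#include <vector>

// Select components of `v` by index (0 = x, 1 = y, 2 = z, 3 = w).
// The indices are ordinary run-time values, so they can come from a loop.
std::vector<float> swizzle(const std::array<float, 4> &v,
                           const std::vector<unsigned> &components) {
    std::vector<float> out;
    for (unsigned c : components) {
        assert(c < 4);
        out.push_back(v[c]);
    }
    return out;
}
```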

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years ago glsl: Add a new ir_builder::dotlike() function.
Kenneth Graunke [Tue, 3 Sep 2013 23:46:05 +0000 (16:46 -0700)]
glsl: Add a new ir_builder::dotlike() function.

dotlike() uses ir_binop_mul for scalars, and ir_binop_dot for vectors.

When generating built-in functions, we often want to use regular
multiply for scalar signatures, and dot() for vector signatures.
ir_binop_dot only works on vectors, so we have to switch opcodes,
even if the code is otherwise identical.  dotlike() makes this easy.
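
The opcode choice can be sketched on plain data. The Opcode enumeration and both functions are hypothetical models of the behaviour described above, not the ir_builder code:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Opcode { Mul, Dot };

// Scalars use a plain multiply, vectors use a dot product.
Opcode dotlike_opcode(std::size_t vector_elements) {
    return vector_elements == 1 ? Opcode::Mul : Opcode::Dot;
}

// Value-level model of the same rule.
float dotlike(const std::vector<float> &a, const std::vector<float> &b) {
    assert(!a.empty() && a.size() == b.size());
    if (dotlike_opcode(a.size()) == Opcode::Mul)
        return a[0] * b[0];
    float sum = 0.0f;
    for (std::size_t i = 0; i < a.size(); i++)
        sum += a[i] * b[i];
    return sum;
}
```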

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
11 years ago glsl: Add IR builder support for generating return statements.
Kenneth Graunke [Tue, 3 Sep 2013 23:44:25 +0000 (16:44 -0700)]
glsl: Add IR builder support for generating return statements.

We use "ret" as the function name since "return" is a C++ keyword, and
"ir_return" is already a class name.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>