mesa.git
10 years agoradeonsi: Rename r600->si remaining identifier in si_hw_context.c.
Andreas Hartmetz [Tue, 7 Jan 2014 01:59:28 +0000 (02:59 +0100)]
radeonsi: Rename r600->si remaining identifier in si_hw_context.c.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
10 years agoradeonsi: Rename radeonsi->si remaining identifiers in si_compute.c.
Andreas Hartmetz [Tue, 7 Jan 2014 01:53:26 +0000 (02:53 +0100)]
radeonsi: Rename radeonsi->si remaining identifiers in si_compute.c.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
10 years agoradeonsi: Rename r600->si remaining identifiers in si_blit.c.
Andreas Hartmetz [Tue, 7 Jan 2014 01:51:35 +0000 (02:51 +0100)]
radeonsi: Rename r600->si remaining identifiers in si_blit.c.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
10 years agoradeonsi: Rename r600->si for functions in si_pipe.h.
Andreas Hartmetz [Tue, 7 Jan 2014 01:40:22 +0000 (02:40 +0100)]
radeonsi: Rename r600->si for functions in si_pipe.h.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
10 years agoradeonsi: Rename r600->si for functions in si.h.
Andreas Hartmetz [Tue, 7 Jan 2014 01:14:42 +0000 (02:14 +0100)]
radeonsi: Rename r600->si for functions in si.h.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
10 years agoradeonsi: Rename r600->si for functions in si_resource.h.
Andreas Hartmetz [Tue, 7 Jan 2014 01:05:57 +0000 (02:05 +0100)]
radeonsi: Rename r600->si for functions in si_resource.h.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
10 years agoradeonsi: Rename r600->si for structs in si_resource.h.
Andreas Hartmetz [Tue, 7 Jan 2014 00:55:08 +0000 (01:55 +0100)]
radeonsi: Rename r600->si for structs in si_resource.h.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
10 years agoradeonsi: Rename r600->si for structs in si.h.
Andreas Hartmetz [Tue, 7 Jan 2014 00:51:30 +0000 (01:51 +0100)]
radeonsi: Rename r600->si for structs in si.h.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
10 years agoradeonsi: Rename r600->si for structs in si_pipe.h.
Andreas Hartmetz [Sat, 11 Jan 2014 14:47:07 +0000 (15:47 +0100)]
radeonsi: Rename r600->si for structs in si_pipe.h.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
10 years agoradeonsi: Apply si_* file naming scheme.
Andreas Hartmetz [Sat, 4 Jan 2014 17:44:33 +0000 (18:44 +0100)]
radeonsi: Apply si_* file naming scheme.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
10 years agoUse AC_PATH_TOOL instead of AC_PATH_PROG for llvm-config.
Michał Górny [Sat, 28 Dec 2013 14:22:09 +0000 (15:22 +0100)]
Use AC_PATH_TOOL instead of AC_PATH_PROG for llvm-config.

This should help with cross-compiling and multilib when $CHOST-specific
llvm-config is expected rather than build host default one.

It will help us a bit in Gentoo where we've started using
i686-pc-linux-gnu-llvm-config for 32-bit multilib LLVM.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Michał Górny <mgorny@gentoo.org>
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=73100
CC: "10.0" <mesa-stable@lists.freedesktop.org>
10 years agoconfigure: Disable xvmc by default
Tom Stellard [Mon, 6 Jan 2014 02:49:03 +0000 (18:49 -0800)]
configure: Disable xvmc by default

The xvmc unit tests are failing on r300g and r600g.

Reviewed-by: Vinson Lee <vlee@freedesktop.org>
10 years agoglsl: Remove exec_list iterators now that nothing uses them.
Kenneth Graunke [Fri, 22 Nov 2013 11:42:06 +0000 (03:42 -0800)]
glsl: Remove exec_list iterators now that nothing uses them.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl: Replace iterators in ir_reader.cpp with ad-hoc list walking.
Kenneth Graunke [Sat, 11 Jan 2014 01:08:33 +0000 (17:08 -0800)]
glsl: Replace iterators in ir_reader.cpp with ad-hoc list walking.

These can't use foreach_list since they want to skip over the first few
list elements.  Just doing the ad-hoc list walking isn't too bad.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl: Use a new foreach_two_lists macro for walking two lists at once.
Kenneth Graunke [Sat, 11 Jan 2014 00:39:17 +0000 (16:39 -0800)]
glsl: Use a new foreach_two_lists macro for walking two lists at once.

When handling function calls, we often want to walk through the list of
formal parameters and list of actual parameters at the same time.
(Both are guaranteed to be the same length.)

Previously, we used a pattern of:

   exec_list_iterator 1st_iter = <1st list>.iterator();
   foreach_iter(exec_list_iterator, 2nd_iter, <2nd list>) {
      ...
      1st_iter.next();
   }

This was awkward, since you had to manually iterate through one of
the two lists.

This patch introduces a foreach_two_lists macro which safely walks
through two lists at the same time, so you can simply do:

   foreach_two_lists(1st_node, <1st list>, 2nd_node, <2nd list>) {
      ...
   }

v2: Rename macro from foreach_list2 to foreach_two_lists, as suggested
    by Ian Romanick.
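
For illustration only, the lockstep walk can be sketched in plain C as
below. The node type, the FOREACH_TWO_LISTS name and the dot_product()
helper are hypothetical and are not Mesa's exec_list implementation; the
sketch assumes simple NULL-terminated singly linked lists.

```c
#include <stddef.h>

/* Hypothetical minimal node type; Mesa's exec_node is laid out differently. */
struct node {
   struct node *next;
   int value;
};

/* Walk two NULL-terminated lists in lockstep.  The loop stops as soon as
 * either list runs out, so a length mismatch cannot walk off the end. */
#define FOREACH_TWO_LISTS(n1, head1, n2, head2)            \
   for (struct node *n1 = (head1), *n2 = (head2);          \
        n1 != NULL && n2 != NULL;                          \
        n1 = n1->next, n2 = n2->next)

/* Example use: pair each "formal" with its "actual" and combine them. */
static int dot_product(struct node *formals, struct node *actuals)
{
   int sum = 0;
   FOREACH_TWO_LISTS(f, formals, a, actuals) {
      sum += f->value * a->value;
   }
   return sum;
}
```

Both cursors advance in the loop header, so the body never has to step
one of the lists by hand.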

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl: Statically cast parameter exec_node to ir_variable.
Kenneth Graunke [Sat, 11 Jan 2014 00:13:54 +0000 (16:13 -0800)]
glsl: Statically cast parameter exec_node to ir_variable.

Formal function parameters are always ir_variable objects, not an
arbitrary ir_instruction.  So there's no need to dynamically cast here.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl: Cast ir_call parameters to ir_rvalue, not ir_instruction.
Kenneth Graunke [Sat, 23 Nov 2013 17:51:52 +0000 (09:51 -0800)]
glsl: Cast ir_call parameters to ir_rvalue, not ir_instruction.

A function call's parameters are always rvalues.  ir_rvalue may not
always be a subclass of ir_instruction in the future, so we should use
the right one.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl: Replace foreach_iter and iter.remove() with foreach_list_safe.
Kenneth Graunke [Sat, 11 Jan 2014 00:46:26 +0000 (16:46 -0800)]
glsl: Replace foreach_iter and iter.remove() with foreach_list_safe.

foreach_list_safe allows you to safely remove the current node.
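
A minimal sketch of why a "safe" walk tolerates removal, assuming a
circular doubly linked list with a sentinel head. All names here
(FOREACH_LIST_SAFE, node_remove, remove_odd) are hypothetical and only
mimic the idea, not Mesa's actual macro.

```c
#include <stddef.h>

/* Hypothetical circular doubly linked list with a sentinel head node. */
struct node {
   struct node *prev, *next;
   int value;
};

static void list_init(struct node *head)
{
   head->prev = head->next = head;
}

static void list_push_tail(struct node *head, struct node *n)
{
   n->prev = head->prev;
   n->next = head;
   head->prev->next = n;
   head->prev = n;
}

static void node_remove(struct node *n)
{
   n->prev->next = n->next;
   n->next->prev = n->prev;
   n->prev = n->next = NULL;
}

/* Cache the successor before the body runs, so the body may unlink n. */
#define FOREACH_LIST_SAFE(n, tmp, head)                          \
   for (struct node *n = (head)->next, *tmp = n->next;           \
        n != (head);                                             \
        n = tmp, tmp = n->next)

/* Unlink every node holding an odd value; returns how many were removed. */
static int remove_odd(struct node *head)
{
   int removed = 0;
   FOREACH_LIST_SAFE(n, tmp, head) {
      if (n->value % 2 != 0) {
         node_remove(n);
         removed++;
      }
   }
   return removed;
}
```

The non-safe variant would read n->next after the body, which is no
longer valid once n has been unlinked.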

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl: Convert piles of foreach_iter to foreach_list_safe.
Kenneth Graunke [Fri, 22 Nov 2013 10:10:15 +0000 (02:10 -0800)]
glsl: Convert piles of foreach_iter to foreach_list_safe.

In these cases, we edit the list (or at least might), so we use the
foreach_list_safe variant.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoglsl: Convert piles of foreach_iter to the newer foreach_list macro.
Kenneth Graunke [Fri, 22 Nov 2013 09:25:42 +0000 (01:25 -0800)]
glsl: Convert piles of foreach_iter to the newer foreach_list macro.

foreach_iter and exec_list_iterators have been deprecated for some time now;
we just hadn't ever bothered to convert code to the newer foreach_list
and foreach_list_safe macros.

In these cases, we aren't editing the list, so we can use foreach_list
rather than foreach_list_safe.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agoi965: Ensure that all necessary state is re-emitted if we run out of aperture.
Paul Berry [Sat, 11 Jan 2014 02:56:14 +0000 (18:56 -0800)]
i965: Ensure that all necessary state is re-emitted if we run out of aperture.

Prior to this patch, if we ran out of aperture space during
brw_try_draw_prims(), we would rewind the batch buffer pointer
(potentially throwing away some state that may have been emitted by
brw_upload_state()), flush the batch, and then try again.  However, we
wouldn't reset the dirty bits to the state they had before the call to
brw_upload_state().  As a result, when we tried again, there was a
danger that we wouldn't re-emit all the necessary state.  (Note: prior
to the introduction of hardware contexts, this wasn't a problem
because flushing the batch forced all state to be re-emitted).

This patch fixes the problem by leaving the dirty bits set at the end
of brw_upload_state(); we only clear them after we have determined
that we don't need to rewind the batch buffer.
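
The ordering can be sketched as follows. This is a toy model under
stated assumptions: struct ctx, upload_state(), fits_in_aperture() and
try_draw() are hypothetical stand-ins, not the i965 functions, and the
aperture failure is simulated with a counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver context. */
struct ctx {
   uint32_t dirty;          /* state flagged as needing re-emission */
   uint32_t emitted;        /* state currently sitting in the batch */
   int failures_left;       /* simulate this many aperture failures */
};

static void upload_state(struct ctx *c)
{
   c->emitted |= c->dirty;  /* emit, but leave the dirty bits set */
}

static bool fits_in_aperture(struct ctx *c)
{
   if (c->failures_left > 0) {
      c->failures_left--;
      return false;
   }
   return true;
}

static bool try_draw(struct ctx *c)
{
   for (int attempt = 0; attempt < 2; attempt++) {
      upload_state(c);
      if (fits_in_aperture(c)) {
         c->dirty = 0;      /* only now is it safe to clear */
         return true;
      }
      c->emitted = 0;       /* rewind and flush the batch, then retry */
   }
   return false;
}
```

Because the dirty bits survive the failed attempt, the second call to
upload_state() emits the same state again.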

Cc: 10.0 9.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agor600g: fix glClearBuffer by handling PIPE_CLEAR_COLORi flags correctly
Marek Olšák [Wed, 8 Jan 2014 12:56:30 +0000 (13:56 +0100)]
r600g: fix glClearBuffer by handling PIPE_CLEAR_COLORi flags correctly

also restructure the code

10 years agor600g: handle NULL colorbuffers correctly on R600-R700
Marek Olšák [Wed, 8 Jan 2014 17:13:24 +0000 (18:13 +0100)]
r600g: handle NULL colorbuffers correctly on R600-R700

10 years agor600g: handle NULL colorbuffers correctly on Evergreen
Marek Olšák [Wed, 8 Jan 2014 12:31:59 +0000 (13:31 +0100)]
r600g: handle NULL colorbuffers correctly on Evergreen

10 years agoradeonsi: handle NULL colorbuffers correctly
Marek Olšák [Wed, 8 Jan 2014 00:25:14 +0000 (01:25 +0100)]
radeonsi: handle NULL colorbuffers correctly

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
10 years agogallium/util: easy fixes for NULL colorbuffers
Marek Olšák [Wed, 8 Jan 2014 00:07:20 +0000 (01:07 +0100)]
gallium/util: easy fixes for NULL colorbuffers

Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agost/mesa: bind NULL colorbuffers as specified by glDrawBuffers
Marek Olšák [Wed, 8 Jan 2014 00:09:15 +0000 (01:09 +0100)]
st/mesa: bind NULL colorbuffers as specified by glDrawBuffers

An example of why this is required:

    Let's say there's a fragment shader writing to gl_FragData[0..1].
    The user calls: glDrawBuffers(2, {GL_NONE, GL_COLOR_ATTACHMENT0});

    That means gl_FragData[0] is unused and gl_FragData[1] is written
    to GL_COLOR_ATTACHMENT0.

st/mesa was skipping the GL_NONE draw buffer, therefore gl_FragData[0]
was written to GL_COLOR_ATTACHMENT0, which was wrong.
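
As a rough sketch of the intended mapping (not the st/mesa code), the
helper below keeps a NULL slot for every GL_NONE entry instead of
skipping it. DRAW_BUFFER_NONE, bind_colorbuffers() and the string
"attachments" are hypothetical names used only for illustration.

```c
#include <stddef.h>

#define DRAW_BUFFER_NONE (-1)   /* hypothetical stand-in for GL_NONE */

/* Fill cbufs[i] for each draw buffer i.  A GL_NONE entry becomes a NULL
 * slot rather than being skipped, so shader output i still maps to slot i. */
static void bind_colorbuffers(const int *draw_buffers, size_t count,
                              const char *const *attachments,
                              const char **cbufs)
{
   for (size_t i = 0; i < count; i++) {
      cbufs[i] = (draw_buffers[i] == DRAW_BUFFER_NONE)
                    ? NULL
                    : attachments[draw_buffers[i]];
   }
}
```

With draw buffers {NONE, 0}, slot 0 is NULL and slot 1 refers to
attachment 0, which matches the gl_FragData example above.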

This commit fixes it, but drivers must also be fixed not to crash when
binding NULL colorbuffers. There is also a new set of piglit tests for this.

The MSAA state also had to be fixed not to crash when reading fb->cbufs[0].

Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: handle GL_NONE draw buffers correctly in glClear
Marek Olšák [Wed, 8 Jan 2014 00:23:43 +0000 (01:23 +0100)]
mesa: handle GL_NONE draw buffers correctly in glClear

Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agost/mesa: use sRGB formats for MSAA resolving if destination is sRGB
Marek Olšák [Tue, 7 Jan 2014 21:00:20 +0000 (22:00 +0100)]
st/mesa: use sRGB formats for MSAA resolving if destination is sRGB

Copied from the i965 driver, including the big comment.

Cc: 9.2 10.0 <mesa-stable@lists.freedesktop.org>
10 years agost/mesa: check depth and stencil writemask before clearing
Marek Olšák [Fri, 27 Dec 2013 18:10:03 +0000 (19:10 +0100)]
st/mesa: check depth and stencil writemask before clearing

10 years agost/mesa: always prefer pipe->clear over clear_with_quad (v2)
Marek Olšák [Fri, 6 Dec 2013 17:58:52 +0000 (18:58 +0100)]
st/mesa: always prefer pipe->clear over clear_with_quad (v2)

v2: clear depth and stencil together

10 years agost/egl: Flush resources before presentation
Martin Andersson [Thu, 26 Dec 2013 09:33:28 +0000 (10:33 +0100)]
st/egl: Flush resources before presentation

Fixes wayland regression on r600g due to fast clear introduced by commit
edbbfac6.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
10 years agodri: set yInverted default to GL_TRUE
Tapani Pälli [Wed, 8 Jan 2014 13:17:59 +0000 (15:17 +0200)]
dri: set yInverted default to GL_TRUE

yInverted is used by EGL_NOK_texture_from_pixmap to indicate that
window system rendering is y-inverted compared to OpenGL texture
representation. This extension is only known to be used with the X11
window system, where the sane default is GL_TRUE.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73371

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
10 years agoegl_dri2: call dri2_add_configs_for_visuals after extensions set
Tapani Pälli [Wed, 8 Jan 2014 13:17:58 +0000 (15:17 +0200)]
egl_dri2: call dri2_add_configs_for_visuals after extensions set

dri2_add_config makes decisions based on NOK_texture_from_pixmap so
it needs to be enabled before calling dri2_add_configs_for_visuals.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
10 years agomesa: Set the correct error in _mesa_BeginConditionalRender
Ian Romanick [Thu, 9 Jan 2014 22:08:55 +0000 (14:08 -0800)]
mesa: Set the correct error in _mesa_BeginConditionalRender

Piglit was recently changed to expect the correct error code (piglit
commit 271b998), so it started failing on Mesa.  This corrects that
failure and adds some spec quotations to justify the errors set.

The code was rearranged a little bit to match the order listed in the
spec.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoi965: Delete duplicate write_timestamp function.
Kenneth Graunke [Wed, 11 Dec 2013 22:55:45 +0000 (14:55 -0800)]
i965: Delete duplicate write_timestamp function.

brw_queryobj.c needs a version of write_timestamp that works on all
generations for the QueryCounter() driver hook.  So there's no point in
duplicating it in gen6_queryobj.c.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Eric Anholt <eric@anholt.net>
10 years agoi965: Fix clears of layered framebuffers with mismatched layer counts.
Paul Berry [Tue, 7 Jan 2014 14:29:47 +0000 (06:29 -0800)]
i965: Fix clears of layered framebuffers with mismatched layer counts.

Previously, Mesa enforced the following rule (from
ARB_geometry_shader4's list of criteria for framebuffer completeness):

  * If any framebuffer attachment is layered, all attachments must have
    the same layer count.  For three-dimensional textures, the layer count
    is the depth of the attached volume.  For cube map textures, the layer
    count is always six.  For one- and two-dimensional array textures, the
    layer count is simply the number of layers in the array texture.
    { FRAMEBUFFER_INCOMPLETE_LAYER_COUNT_ARB }

However, when ARB_geometry_shader4 was adopted into GL 3.2, this rule
was dropped; GL 3.2 permits different attachments to have different
layer counts.  This patch brings Mesa in line with GL 3.2.

In order to ensure that layered clears properly clear all layers, we
now have to keep track of the maximum number of layers in a layered
framebuffer.
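
A minimal sketch of that bookkeeping, assuming a hypothetical
per-attachment layer count (0 meaning unused or not layered). The
struct and function names are illustrative, not the ones in Mesa.

```c
#include <stddef.h>

/* Hypothetical attachment record; 0 layers means unused or not layered. */
struct attachment {
   unsigned layers;
};

/* A layered clear has to cover the largest layer count of any attachment. */
static unsigned max_num_layers(const struct attachment *att, size_t count)
{
   unsigned max = 0;
   for (size_t i = 0; i < count; i++) {
      if (att[i].layers > max)
         max = att[i].layers;
   }
   return max;
}
```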

Fixes the following piglit tests in spec/!OpenGL 3.2/layered-rendering:
- clear-color-all-types 1d_array mipmapped
- clear-color-all-types 1d_array single_level
- clear-color-mismatched-layer-count
- framebuffer-layer-count-mismatch

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agomain: check texture target when validating layered framebuffers.
Paul Berry [Wed, 20 Nov 2013 03:01:37 +0000 (19:01 -0800)]
main: check texture target when validating layered framebuffers.

From section 4.4.4 (Framebuffer Completeness) of the GL 3.2 spec:

    If any framebuffer attachment is layered, all populated
    attachments must be layered. Additionally, all populated color
    attachments must be from textures of the same target.

We weren't checking that the attachments were from textures of the
same target.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
10 years agoi965/gen6/blorp: Remove redundant HiZ workaround
Chad Versace [Tue, 24 Dec 2013 01:49:21 +0000 (17:49 -0800)]
i965/gen6/blorp: Remove redundant HiZ workaround

Commit 1a92881 added extra flushes to fix a HiZ hang in
WebGL Google Maps. With the extra flushes emitted by the previous two
patches, the flushes added by 1a92881 are redundant.

Tested with the same criteria as in 1a92881: by zooming in and out
continuously for 2 hours on Sandybridge Chrome OS (codename
Stumpy) without a hang.

CC: Kenneth Graunke <kenneth@whitecape.org>
CC: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
10 years agoi965/gen6/blorp: Set need_workaround_flush at top of blorp
Chad Versace [Tue, 24 Dec 2013 01:48:45 +0000 (17:48 -0800)]
i965/gen6/blorp: Set need_workaround_flush at top of blorp

Unconditionally set brw->need_workaround_flush at the top of gen6 blorp
state emission.

The art of emitting workaround flushes on Sandybridge is mysterious and
not fully understood. Ken and I believe that
intel_emit_post_sync_nonzero_flush() may be required when switching from
regular drawing to blorp.  This is an extra safety measure to prevent
undiscovered difficult-to-diagnose gpu hangs.

I verified that on ChromeOS, pre-patch, need_workaround_flush was not
set at the top of blorp, as Paul expected. To verify, I inserted the
following debug code at the top of gen6_blorp_exec(), restarted the ui,
and inspected the logs in /var/log/ui. The abort gets triggered so early
that the browser never appears on the display.

    static void
    gen6_blorp_exec(...)
    {
        if (!brw->need_workaround_flush) {
            fprintf(stderr, "chadv: %s:%d\n", __FILE__, __LINE__);
            abort();
        }
        ...
    }

CC: Kenneth Graunke <kenneth@whitecape.org>
CC: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
10 years agoi965/gen6/blorp: Set need_workaround_flush immediately after primitive
Chad Versace [Tue, 24 Dec 2013 01:46:51 +0000 (17:46 -0800)]
i965/gen6/blorp: Set need_workaround_flush immediately after primitive

This patch makes the workaround code in gen6 blorp follow the pattern
established in the regular draw path. It shouldn't result in any
behavioral change.

On gen6, there are two places where we emit 3D_CMD_PRIM: brw_emit_prim()
and gen6_blorp_emit_primitive().  brw_emit_prim() sets
need_workaround_flush immediately after emitting the primitive, but
blorp does not. Blorp sets need_workaround_flush at the bottom of
brw_blorp_exec().

This patch moves the need_workaround_flush from brw_blorp_exec() to
gen6_blorp_emit_primitive().  There is no need to set
need_workaround_flush in gen7_blorp_emit_primitive() because the
workaround applies only to gen6.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
10 years agodocs: Import 10.0.2 release notes, add news item.
Carl Worth [Thu, 9 Jan 2014 20:05:28 +0000 (12:05 -0800)]
docs: Import 10.0.2 release notes, add news item.

10 years agomesa: add missing SNORM formats in _mesa_base_fbo_format()
Brian Paul [Wed, 8 Jan 2014 16:05:29 +0000 (09:05 -0700)]
mesa: add missing SNORM formats in _mesa_base_fbo_format()

We weren't handling the LUMINANCE_SNORM, LUMINANCE_ALPHA_SNORM and
INTENSITY_SNORM cases.  Note that adding these cases here does not
require a driver to support rendering to these surface types.  If
the driver can't do it we'll report an incomplete framebuffer.

NVIDIA doesn't support GL_EXT_texture_snorm but their driver
accepts these formats in glRenderBufferStorage().

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
10 years agomesa: remove dead geom shader code
Brian Paul [Tue, 7 Jan 2014 23:13:56 +0000 (16:13 -0700)]
mesa: remove dead geom shader code

I doubt the swrast-based drivers will ever support GS.

Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agodocs: minor updates to VMware SVGA3D driver page
Brian Paul [Tue, 7 Jan 2014 17:50:21 +0000 (10:50 -0700)]
docs: minor updates to VMware SVGA3D driver page

Signed-off-by: Brian Paul <brianp@vmware.com>
10 years agomesa: check bits per channel for GL_RGBA_SIGNED_COMPONENTS_EXT query
Brian Paul [Tue, 7 Jan 2014 16:05:27 +0000 (09:05 -0700)]
mesa: check bits per channel for GL_RGBA_SIGNED_COMPONENTS_EXT query

If a channel has zero bits it's not signed.

v2: also check for luminance and intensity format bits.  Bruce
Merry's proposed piglit test hits the luminance case.
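
The rule can be sketched like this, assuming a hypothetical format
record with one bit count per RGBA channel. It does not model the
luminance and intensity cases from v2, and it is not Mesa's format
table.

```c
#include <stdbool.h>

/* Hypothetical format description; Mesa's real format tables differ. */
struct format_info {
   bool is_signed;      /* the format stores signed values */
   unsigned bits[4];    /* bits for R, G, B, A */
};

/* A channel is reported as signed only if it actually has bits. */
static void signed_components(const struct format_info *f, bool out[4])
{
   for (int i = 0; i < 4; i++)
      out[i] = f->is_signed && f->bits[i] > 0;
}
```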

Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=73096
Cc: 10.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agomesa: check for MESA_FORMAT_RGB9_E5_FLOAT in _mesa_is_format_signed()
Brian Paul [Tue, 7 Jan 2014 16:05:03 +0000 (09:05 -0700)]
mesa: check for MESA_FORMAT_RGB9_E5_FLOAT in _mesa_is_format_signed()

This packed floating point format only stores positive values.

Bugzilla: http://bugs.freedesktop.org/show_bug.cgi?id=73096
Cc: 10.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agost/mesa: fix breakage from gl_constant::Program[] change
Brian Paul [Thu, 9 Jan 2014 17:57:22 +0000 (10:57 -0700)]
st/mesa: fix breakage from gl_constant::Program[] change

10 years agomesa: Use functions to convert gl_shader_stage to PROGRAM enum or pipe target.
Paul Berry [Wed, 8 Jan 2014 19:09:58 +0000 (11:09 -0800)]
mesa: Use functions to convert gl_shader_stage to PROGRAM enum or pipe target.

Suggested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
v2: Improve assert message.

10 years agomain: Change init_program_limits() to use gl_shader_stage.
Paul Berry [Wed, 8 Jan 2014 18:32:18 +0000 (10:32 -0800)]
main: Change init_program_limits() to use gl_shader_stage.

This allows the caller to execute it in a loop rather than
hand-rolling a separate call for each stage.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoglsl: Index into ctx->Const.Program[] rather than using ad-hoc code.
Paul Berry [Wed, 8 Jan 2014 18:17:01 +0000 (10:17 -0800)]
glsl: Index into ctx->Const.Program[] rather than using ad-hoc code.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agomesa: Index into ctx->Const.Program[] rather than using ad-hoc code.
Paul Berry [Wed, 8 Jan 2014 18:17:01 +0000 (10:17 -0800)]
mesa: Index into ctx->Const.Program[] rather than using ad-hoc code.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agomesa: replace ctx->Const.{Vertex,Fragment,Geometry}Program with an array.
Paul Berry [Wed, 8 Jan 2014 18:00:28 +0000 (10:00 -0800)]
mesa: replace ctx->Const.{Vertex,Fragment,Geometry}Program with an array.

These are replaced with
ctx->Const.Program[MESA_SHADER_{VERTEX,FRAGMENT,GEOMETRY}].  In
patches to follow, this will allow us to replace a lot of ad-hoc logic
with a variable index into the array.
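
To illustrate what the array buys, here is a small sketch with
hypothetical names (enum shader_stage, struct constants,
init_program_limits) and made-up limit values. It is not the mtypes.h
definition; it only shows one loop replacing three per-stage calls.

```c
/* Hypothetical stage enum; the last entry doubles as the array size. */
enum shader_stage {
   STAGE_VERTEX,
   STAGE_GEOMETRY,
   STAGE_FRAGMENT,
   STAGE_COUNT
};

struct program_constants {
   unsigned max_uniform_components;
};

struct constants {
   struct program_constants program[STAGE_COUNT];
};

/* Per-stage setup; the numbers are invented for the example. */
static void init_program_limits(struct program_constants *p,
                                enum shader_stage stage)
{
   p->max_uniform_components = (stage == STAGE_GEOMETRY) ? 512 : 1024;
}

/* One loop over the array instead of a hand-written call per stage. */
static void init_constants(struct constants *c)
{
   for (int s = 0; s < STAGE_COUNT; s++)
      init_program_limits(&c->program[s], (enum shader_stage)s);
}
```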

With the exception of the changes to mtypes.h, this patch was
generated entirely by the command:

    find src -type f '(' -iname '*.c' -o -iname '*.cpp' -o -iname '*.py' \
    -o -iname '*.y' ')' -print0 | xargs -0 sed -i \
    -e 's/Const\.VertexProgram/Const.Program[MESA_SHADER_VERTEX]/g' \
    -e 's/Const\.GeometryProgram/Const.Program[MESA_SHADER_GEOMETRY]/g' \
    -e 's/Const\.FragmentProgram/Const.Program[MESA_SHADER_FRAGMENT]/g'

Suggested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agollvmpipe: Honour pipe_rasterizer::point_quad_rasterization.
José Fonseca [Wed, 8 Jan 2014 17:19:41 +0000 (17:19 +0000)]
llvmpipe: Honour pipe_rasterizer::point_quad_rasterization.

Commit eda21d2a3010d9fc5a68b55a843c5e44b2abf8dd fixed the rasterization
of points for Direct3D but ended up breaking the rasterization of OpenGL
non-sprite points, in particular conform's pntrast.c test.

The only way to get both working is to properly honour
pipe_rasterizer::point_quad_rasterization, and follow the weird OpenGL
rule when it is false.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
10 years agoi965: Don't do the temporary-and-blit-copy for INVALIDATE_RANGE maps.
Eric Anholt [Tue, 24 Dec 2013 23:11:54 +0000 (15:11 -0800)]
i965: Don't do the temporary-and-blit-copy for INVALIDATE_RANGE maps.

We definitely want to fall through to the unsynchronized map case, instead
of wasting bandwidth on a copy.  Prevents a -43.2407% +/- 1.06113% (n=49)
performance regression on aa10perf when teaching glamor to provide the
GL_INVALIDATE_RANGE_BIT information.

This is a performance fix, which I usually wouldn't cherry-pick to stable.
But this really was just a bug in the code; its presence would
discourage developers from giving us the best information they can, and I
think we've got fairly high confidence in the unsynchronized map path
already.

Cc: 10.0 9.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoi965: Fix handling of MESA_pack_invert in blit (PBO) readpixels.
Eric Anholt [Mon, 23 Dec 2013 20:11:25 +0000 (12:11 -0800)]
i965: Fix handling of MESA_pack_invert in blit (PBO) readpixels.

Fixes piglit GL_MESA_pack_invert/readpixels and GPU hangs with glamor and
cairo-gl.

Cc: 10.0 9.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
10 years agoi965: Fix incorrect bounds tracking for blit readpixels's GPU access.
Eric Anholt [Mon, 23 Dec 2013 09:56:26 +0000 (01:56 -0800)]
i965: Fix incorrect bounds tracking for blit readpixels's GPU access.

While incorrect, it probably wouldn't affect anyone ever: You'd have to do
an appropriately-formatted readpixels into a PBO, then overwrite the tail
end of the updated area of the PBO with glBufferSubData(), and you
wouldn't get appropriate synchronization.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
10 years agoi965: Use SET_FIELD to safety check our x/y offsets in blits.
Eric Anholt [Mon, 23 Dec 2013 09:48:09 +0000 (01:48 -0800)]
i965: Use SET_FIELD to safety check our x/y offsets in blits.

The earlier assert made sure that our math didn't exceed our bounds, but
this makes sure that we don't overflow from the high bits of X into the low
bits of Y.  We've already put checks in intel_miptree_blit(), but I've
wanted to expand the type in our prototype from short to uint32_t, and we
could get in trouble with intel_emit_linear_blit() if we did.

v2: Add Ken's comment about the funny language extension used.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> (v1)
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (v1)
10 years agoi965: Add an assert for when SET_FIELD's value exceeds the field size.
Eric Anholt [Mon, 23 Dec 2013 09:39:42 +0000 (01:39 -0800)]
i965: Add an assert for when SET_FIELD's value exceeds the field size.

This was one of the things we always wanted to do to this, to make it more
useful than just (value << FIELD_MASK).
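
A sketch of a checked field setter, assuming the usual shift-and-mask
convention. The BLT_X/BLT_Y field layout and the pack_xy() helper are
hypothetical, and this version uses an inline function rather than any
compiler extension, so it is not the driver's macro.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical field layout: X in bits 0..15, Y in bits 16..31. */
#define BLT_X_SHIFT 0
#define BLT_X_MASK  0x0000ffffu
#define BLT_Y_SHIFT 16
#define BLT_Y_MASK  0xffff0000u

static inline uint32_t set_field(uint32_t value, unsigned shift, uint32_t mask)
{
   /* Shift in 64 bits so bits pushed past bit 31 are still visible. */
   uint64_t fieldval = (uint64_t)value << shift;
   /* Any bit outside the mask means the value does not fit the field. */
   assert((fieldval & ~(uint64_t)mask) == 0);
   return (uint32_t)fieldval;
}

#define SET_FIELD(value, field) \
   set_field((value), field##_SHIFT, field##_MASK)

static uint32_t pack_xy(uint32_t x, uint32_t y)
{
   return SET_FIELD(x, BLT_X) | SET_FIELD(y, BLT_Y);
}
```

Passing an X value of 0x10000 would trip the assert instead of silently
spilling into the Y field.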

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
10 years agoi965: Add a safety check for emitting blits.
Eric Anholt [Mon, 23 Dec 2013 09:26:56 +0000 (01:26 -0800)]
i965: Add a safety check for emitting blits.

With all of the flipping and pitch twiddling and miptree layout involved
in our blits, there are lots of ways for us to scribble outside of a
buffer.  Put in a check that we're not about to do so.

This catches a bug that glamor was running into.
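
The kind of check meant here can be sketched as below. This is a
simplified model with a hypothetical blit_in_bounds() helper: it assumes
a linear (untiled) buffer and a positive pitch, and it is not the check
the driver performs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check: does a w x h rectangle at (x, y) stay inside a
 * linear buffer of `size` bytes with the given pitch and bytes per pixel? */
static bool blit_in_bounds(uint32_t x, uint32_t y, uint32_t w, uint32_t h,
                           uint32_t pitch, uint32_t cpp, uint64_t size)
{
   if (w == 0 || h == 0)
      return true;   /* nothing is written */

   /* One past the last byte touched: start of last row + end of last pixel. */
   uint64_t end = ((uint64_t)y + h - 1) * pitch + ((uint64_t)x + w) * cpp;
   return end <= size;
}
```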

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
10 years agoi965: Don't call the blitter on addresses it can't handle.
Eric Anholt [Mon, 23 Dec 2013 23:30:03 +0000 (15:30 -0800)]
i965: Don't call the blitter on addresses it can't handle.

Noticed by tex3d-maxsize on my next commit to check that our addresses
don't overflow.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
10 years agomesa: Namespace qualify fma to override ambiguity with fma from math.h
Thomas Sondergaard [Tue, 7 Jan 2014 20:31:00 +0000 (13:31 -0700)]
mesa: Namespace qualify fma to override ambiguity with fma from math.h

The MSVC 2013 version of math.h includes an fma() function.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: Work around internal compiler error
Thomas Sondergaard [Tue, 7 Jan 2014 20:31:00 +0000 (13:31 -0700)]
mesa: Work around internal compiler error

This small rearrangement avoids an MSVC 2013 ICE. Also, this should be
a better memory access order.

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
10 years agomesa: Fix compile error with MSVC 2013
Thomas Sondergaard [Tue, 7 Jan 2014 20:31:00 +0000 (13:31 -0700)]
mesa: Fix compile error with MSVC 2013

This fixes the following compile error:
src\glsl\ir_constant_expression.cpp(1405) : error C2666: 'copysign' : 3
overloads have similar conversions

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: Preliminary support for MSVC_VERSION=12.0
Thomas Sondergaard [Tue, 7 Jan 2014 20:31:00 +0000 (13:31 -0700)]
mesa: Preliminary support for MSVC_VERSION=12.0

Cc: "10.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agofreedreno: add basic query support
Rob Clark [Wed, 8 Jan 2014 02:39:13 +0000 (21:39 -0500)]
freedreno: add basic query support

Add for now some simple/basic query support (ie. things not actually
requiring the GPU).  Might change around a bit when I actually add
GPU queries, but for now this enables some useful performance info
in the GALLIUM_HUD.  For example:

  GALLIUM_HUD=fps+batches+batches-sysmem+batches-gmem+restores,draw-calls

The driver-specific queries are:

  + draw-calls
  + batches - number of batches per second, sum of batches-sysmem
    plus batches-gmem
  + batches-gmem - render a set of tiles in GMEM, for each tile
    (optionally) system mem -> gmem (restore), plus N draws,
    plus gmem -> system mem (resolve) per second
  + batches-sysmem - N draws to system memory (GMEM bypass) per
    second
  + restores - number of GMEM batches that required restore per
    second

Ideally for GMEM rendering, you want batches-gmem to equal fps.  If
the app is doing something that triggers multiple passes (ie. requires
extra round trip gmem <-> system memory) then the # of batches per
second will go up relative to fps.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno/a3xx: use cs patch instead of RFI+RMW
Rob Clark [Wed, 8 Jan 2014 15:06:52 +0000 (10:06 -0500)]
freedreno/a3xx: use cs patch instead of RFI+RMW

Since we now have the cmdstream patch mechanism needed for hw binning,
might as well also use it for RB_RENDER_CONTROL updates.  This avoids
the need to use RMW (and associated WFI) to update RB_RENDER_CONTROL.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno/a3xx: support for hw binning pass
Rob Clark [Tue, 7 Jan 2014 15:55:07 +0000 (10:55 -0500)]
freedreno/a3xx: support for hw binning pass

The binning pass sorts vertices into which bins/tiles they apply to.
The visibility information generated during the binning pass can be
used to speed up the rendering pass by filtering out vertices which
do not apply to the current tile.  See:

 https://github.com/freedreno/freedreno/wiki/Adreno-tiling#optimized-approach

This brings a significant fps boost.  A rough assortment of tests
(supertuxkart, etracer, tremulous, glmark2 'build' test, etc) seems
to yield a ~35-45% fps improvement.
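
As a rough CPU-side illustration of what a binning pass computes (the hardware produces a visibility stream; this sketch uses a conservative bounding-box test instead, and every name in it is hypothetical):

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// A screen-space triangle (hypothetical type for this sketch).
struct Tri { float x[3]; float y[3]; };

// For each tile (row-major order), collect the indices of the triangles
// whose bounding box overlaps that tile.
std::vector<std::vector<std::size_t>>
bin_triangles(const std::vector<Tri> &tris, int width, int height,
              int tile_w, int tile_h)
{
    assert(width > 0 && height > 0 && tile_w > 0 && tile_h > 0);
    const int tiles_x = (width + tile_w - 1) / tile_w;
    const int tiles_y = (height + tile_h - 1) / tile_h;
    std::vector<std::vector<std::size_t>> bins(
        static_cast<std::size_t>(tiles_x) * tiles_y);

    for (std::size_t i = 0; i < tris.size(); i++) {
        const Tri &t = tris[i];
        const float min_x = std::min({t.x[0], t.x[1], t.x[2]});
        const float max_x = std::max({t.x[0], t.x[1], t.x[2]});
        const float min_y = std::min({t.y[0], t.y[1], t.y[2]});
        const float max_y = std::max({t.y[0], t.y[1], t.y[2]});

        // Skip triangles that are entirely off screen.
        if (max_x < 0.0f || max_y < 0.0f || min_x >= width || min_y >= height)
            continue;

        // Clamp the bounding box to the screen, then convert to tile indices.
        const int tx0 = static_cast<int>(std::max(min_x, 0.0f)) / tile_w;
        const int ty0 = static_cast<int>(std::max(min_y, 0.0f)) / tile_h;
        const int tx1 = static_cast<int>(std::min(max_x, width - 1.0f)) / tile_w;
        const int ty1 = static_cast<int>(std::min(max_y, height - 1.0f)) / tile_h;

        for (int ty = ty0; ty <= ty1; ty++)
            for (int tx = tx0; tx <= tx1; tx++)
                bins[static_cast<std::size_t>(ty) * tiles_x + tx].push_back(i);
    }
    return bins;
}
```

The rendering pass for a given tile then only needs to process the triangles listed in that tile's bin, which is where the speedup comes from.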

For now, to be conservative, the binning pass is not enabled yet by
default.  To enable it use:

  FD_MESA_DEBUG=binning

So far I haven't found anything that breaks with binning enabled,
but I'd like a bit more testing before I enable it as default.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno: be more clever about gmem usage
Rob Clark [Fri, 27 Dec 2013 15:31:22 +0000 (10:31 -0500)]
freedreno: be more clever about gmem usage

Only need to leave room for depth/stencil if it is actually used, etc.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agofreedreno: resync generated headers
Rob Clark [Tue, 7 Jan 2014 14:49:42 +0000 (09:49 -0500)]
freedreno: resync generated headers

Signed-off-by: Rob Clark <robclark@freedesktop.org>
10 years agoi965: fold offset into coord for textureOffset(gsampler2DRect)
Chris Forbes [Wed, 18 Dec 2013 08:27:59 +0000 (21:27 +1300)]
i965: fold offset into coord for textureOffset(gsampler2DRect)

The hardware is broken with nonzero texel offsets and unnormalized
coordinates; instead of doing correct offsetting, we get garbage.

This just extends the existing workaround for ir_txf and
ir_tg4+gsampler2DRect to also consider ir_tex+gsampler2DRect.
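
The idea behind the workaround can be sketched on the CPU: with unnormalized (texel-space) coordinates, a texel offset is just an addition to the coordinate, so it can be folded in before sampling. This is a minimal model with nearest filtering and clamp-to-edge, not the i965 lowering code; all names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical rectangle texture with one float per texel, row-major.
struct RectTexture {
    int width;
    int height;
    std::vector<float> texels;
};

// Nearest-texel lookup with unnormalized coordinates, clamped to the edge.
float sample_rect_nearest(const RectTexture &tex, float x, float y)
{
    int ix = static_cast<int>(std::floor(x));
    int iy = static_cast<int>(std::floor(y));
    ix = std::min(std::max(ix, 0), tex.width - 1);
    iy = std::min(std::max(iy, 0), tex.height - 1);
    return tex.texels[static_cast<std::size_t>(iy) * tex.width + ix];
}

// textureOffset() for a rect sampler, expressed as a plain lookup with the
// offset folded into the coordinate.
float sample_rect_offset(const RectTexture &tex, float x, float y,
                         int off_x, int off_y)
{
    return sample_rect_nearest(tex, x + off_x, y + off_y);
}
```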

Fixes broken rendering in 'tesseract' when 'mesa_texrectoffset_bug' is
not enabled; also fixes the new piglit test
'tests/spec/glsl-1.30/execution/fs-textureOffset-Rect'.

Has been broken ~forever; I suggest including this only in 10.0 because
the lowering pass doesn't exist in 9.2 or earlier, so those branches would
require quite a different patch.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: Lee Salzman <lsalzman@gmail.com>
Cc: "10.0" <mesa-stable@lists.freedesktop.org>
10 years agomesa: Remove _mesa_progshader_enum_to_string(), which is no longer used.
Paul Berry [Tue, 7 Jan 2014 19:40:00 +0000 (11:40 -0800)]
mesa: Remove _mesa_progshader_enum_to_string(), which is no longer used.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agoglsl: Make more use of gl_shader_stage enum in ir_set_program_inouts.cpp.
Paul Berry [Tue, 7 Jan 2014 19:23:34 +0000 (11:23 -0800)]
glsl: Make more use of gl_shader_stage enum in ir_set_program_inouts.cpp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agoglsl: Make more use of gl_shader_stage enum in lower_clip_distance.cpp.
Paul Berry [Tue, 7 Jan 2014 19:19:22 +0000 (11:19 -0800)]
glsl: Make more use of gl_shader_stage enum in lower_clip_distance.cpp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agoglsl: Make more use of gl_shader_stage enum in link_varyings.cpp.
Paul Berry [Tue, 7 Jan 2014 19:13:32 +0000 (11:13 -0800)]
glsl: Make more use of gl_shader_stage enum in link_varyings.cpp.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
v2: Also rename "shaderType" param of is_varying_var() to "stage".

Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agoglsl: Change _mesa_glsl_parse_state ctor to use gl_shader_stage enum.
Paul Berry [Tue, 7 Jan 2014 17:46:10 +0000 (09:46 -0800)]
glsl: Change _mesa_glsl_parse_state ctor to use gl_shader_stage enum.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
v2: Also rename "target" param to "stage".

Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agomesa: Use gl_shader::Stage instead of gl_shader::Type where possible.
Paul Berry [Tue, 7 Jan 2014 18:58:56 +0000 (10:58 -0800)]
mesa: Use gl_shader::Stage instead of gl_shader::Type where possible.

This reduces confusion since gl_shader::Type is sometimes
GL_SHADER_PROGRAM_MESA but is more frequently
GL_SHADER_{VERTEX,GEOMETRY,FRAGMENT}.  It also has the advantage that
when switching on gl_shader::Stage, the compiler will warn if one of
the possible enum values is unhandled.  Finally, many functions in
src/glsl (especially those dealing with linking) already use
gl_shader_stage to represent pipeline stages; using gl_shader::Stage
in those functions avoids the need for a conversion.
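
A small sketch of the compiler-checking point, assuming GCC or Clang with -Wswitch (part of -Wall). The enum and function below are stand-ins, not Mesa's definitions, and the suffix strings other than the fallback are illustrative.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for a pipeline-stage enum, ordered as in the pipeline.
enum ShaderStage : int {
    STAGE_VERTEX = 0,
    STAGE_GEOMETRY = 1,
    STAGE_FRAGMENT = 2
};

// Pick a file suffix for a shader dump.  The switch has no default label,
// so adding a new enumerator without a matching case triggers a -Wswitch
// warning.  Values outside the enum fall through to the fallback.
const char *stage_file_suffix(ShaderStage stage)
{
    switch (stage) {
    case STAGE_VERTEX:   return "vert";
    case STAGE_GEOMETRY: return "geom";
    case STAGE_FRAGMENT: return "frag";
    }
    return "????";  // unexpected stage
}
```

Switching on a raw GLenum gives the compiler nothing to check, which is one reason to prefer the dedicated enum.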

Note: in the process I changed _mesa_write_shader_to_file() so that if
it encounters an unexpected shader stage, it will use a file suffix of
"????" rather than "geom".

Reviewed-by: Brian Paul <brianp@vmware.com>
v2: Split from patch "mesa: Store gl_shader_stage enum in gl_shader objects."

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agomesa: Store gl_shader_stage enum in gl_shader objects.
Paul Berry [Tue, 7 Jan 2014 18:58:56 +0000 (10:58 -0800)]
mesa: Store gl_shader_stage enum in gl_shader objects.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agomesa: Move declaration of gl_shader_stage earlier in mtypes.h.
Paul Berry [Tue, 7 Jan 2014 18:58:56 +0000 (10:58 -0800)]
mesa: Move declaration of gl_shader_stage earlier in mtypes.h.

Also move the related #define MESA_SHADER_STAGES.  This will allow
gl_shader_stage to be used in struct gl_shader.

Reviewed-by: Brian Paul <brianp@vmware.com>
v2: Split from patch "mesa: Store gl_shader_stage enum in gl_shader objects."

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agoglsl: make _mesa_shader_stage_to_string() available to non-C++ code.
Paul Berry [Tue, 7 Jan 2014 18:58:56 +0000 (10:58 -0800)]
glsl: make _mesa_shader_stage_to_string() available to non-C++ code.

Reviewed-by: Brian Paul <brianp@vmware.com>
v2: Split from patch "mesa: Store gl_shader_stage enum in gl_shader objects."

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
10 years agomesa: Clean up nomenclature for pipeline stages.
Paul Berry [Tue, 7 Jan 2014 18:11:39 +0000 (10:11 -0800)]
mesa: Clean up nomenclature for pipeline stages.

Previously, we had an enum called gl_shader_type which represented
pipeline stages in the order they occur in the pipeline
(i.e. MESA_SHADER_VERTEX=0, MESA_SHADER_GEOMETRY=1, etc), and several
inconsistently named functions for converting between it and other
representations:

- _mesa_shader_type_to_string: gl_shader_type -> string
- _mesa_shader_type_to_index: GLenum (GL_*_SHADER) -> gl_shader_type
- _mesa_program_target_to_index: GLenum (GL_*_PROGRAM) -> gl_shader_type
- _mesa_shader_enum_to_string: GLenum (GL_*_{SHADER,PROGRAM}) -> string

This patch tries to clean things up so that we use more consistent
terminology: the enum is now called gl_shader_stage (to emphasize that
it is in the order of pipeline stages), and the conversion functions are:

- _mesa_shader_stage_to_string: gl_shader_stage -> string
- _mesa_shader_enum_to_shader_stage: GLenum (GL_*_SHADER) -> gl_shader_stage
- _mesa_program_enum_to_shader_stage: GLenum (GL_*_PROGRAM) -> gl_shader_stage
- _mesa_progshader_enum_to_string: GLenum (GL_*_{SHADER,PROGRAM}) -> string

In addition, MESA_SHADER_TYPES has been renamed to MESA_SHADER_STAGES,
for consistency with the new name for the enum.
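
To make the two kinds of conversion concrete, here is a self-contained sketch. The names are hypothetical stand-ins rather than the Mesa functions; the numeric values are the standard OpenGL constants for GL_VERTEX_SHADER, GL_GEOMETRY_SHADER and GL_FRAGMENT_SHADER, and the returned strings are illustrative.

```cpp
#include <cassert>
#include <cstring>

// Stand-in for the stage enum: values follow pipeline order.
enum Stage : int { VERTEX = 0, GEOMETRY = 1, FRAGMENT = 2 };
const int NUM_STAGES = 3;  // plays the role of a "number of stages" define

// Standard OpenGL enum values, redefined locally to stay self-contained.
const unsigned kGLFragmentShader = 0x8B30;
const unsigned kGLVertexShader = 0x8B31;
const unsigned kGLGeometryShader = 0x8DD9;

// GLenum (GL_*_SHADER) -> stage.  Returns false for an unknown enum.
bool shader_enum_to_stage(unsigned gl_enum, Stage *out)
{
    switch (gl_enum) {
    case kGLVertexShader:   *out = VERTEX;   return true;
    case kGLGeometryShader: *out = GEOMETRY; return true;
    case kGLFragmentShader: *out = FRAGMENT; return true;
    default:                return false;
    }
}

// stage -> string, via a table indexed in pipeline order.
const char *stage_to_string(Stage stage)
{
    static const char *const names[NUM_STAGES] = {
        "vertex", "geometry", "fragment"
    };
    if (stage < 0 || stage >= NUM_STAGES)
        return "unknown";
    return names[stage];
}
```

Because the enum values are dense and start at zero, they can index per-stage arrays directly, which the sparse GLenum values cannot.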

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
v2: Also rename the "target" field of _mesa_glsl_parse_state and the
"target" parameter of _mesa_shader_stage_to_string to "stage".

Reviewed-by: Brian Paul <brianp@vmware.com>
10 years agollvmpipe: Fix the bottom_edge_rule adjustment for points.
José Fonseca [Tue, 7 Jan 2014 17:57:59 +0000 (17:57 +0000)]
llvmpipe: Fix the bottom_edge_rule adjustment for points.

The adjustment needs to be applied to the y coordinates and not the x
coordinates, just like the equivalent code for lines and triangles in
lp_setup_line.c and lp_setup_tri.c.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
10 years agollvmpipe: Respect bottom_edge_rule when computing the rasterization bounding boxes.
José Fonseca [Tue, 7 Jan 2014 17:52:21 +0000 (17:52 +0000)]
llvmpipe: Respect bottom_edge_rule when computing the rasterization bounding boxes.

This was inadvertently forgotten when replacing gl_rasterization_rules
with lower_left_origin and half_pixel_center (commit
2737abb44efebfa10ac84b183c20fc5818d1514e).

This makes a difference when lower_left_origin != half_pixel_center, e.g.,
D3D10.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
10 years agoilo: enable HiZ
Chia-I Wu [Mon, 6 Jan 2014 15:32:46 +0000 (23:32 +0800)]
ilo: enable HiZ

The support is still early.  Fast depth buffer clear is not enabled yet.

HiZ can be forced off with ILO_DEBUG=nohiz.

10 years agoilo: resolve Z/HiZ correctly
Chia-I Wu [Tue, 7 Jan 2014 03:57:42 +0000 (11:57 +0800)]
ilo: resolve Z/HiZ correctly

When the depth buffer is to be read, perform a Depth Buffer Resolve if it has
been rendered.  When the depth buffer is to be rendered, perform a HiZ Buffer
Resolve when the depth buffer is modified externally.
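
The rule above can be written as a small decision function. This is a sketch under assumptions: the flag names, the enum, and the function are hypothetical and do not come from the ilo source.

```cpp
#include <cassert>

// Hypothetical flags describing what last happened to a depth slice.
enum SliceFlag : unsigned {
    SLICE_RENDER_WRITE = 1u << 0,    // written by the render pipeline
    SLICE_EXTERNAL_WRITE = 1u << 1   // written outside rendering (CPU, blitter)
};

enum class Resolve { None, DepthBuffer, HiZBuffer };

// Decide which resolve, if any, is needed before the next access.
//  - About to read: resolve the depth buffer if it was rendered to, so
//    the depth data is valid without HiZ.
//  - About to render: resolve the HiZ buffer if the depth data was
//    changed externally, so HiZ matches the depth data again.
Resolve resolve_needed(unsigned flags, bool next_access_is_render)
{
    if (next_access_is_render)
        return (flags & SLICE_EXTERNAL_WRITE) ? Resolve::HiZBuffer
                                              : Resolve::None;
    return (flags & SLICE_RENDER_WRITE) ? Resolve::DepthBuffer
                                        : Resolve::None;
}
```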

10 years agoilo: add flags to texture slices
Chia-I Wu [Thu, 26 Dec 2013 04:03:44 +0000 (12:03 +0800)]
ilo: add flags to texture slices

The flags are used to mark who (CPU, BLT, or RENDER) has accessed the resource
and how (READ or WRITE).

10 years agoilo: rename and add an accessor for texture slices
Chia-I Wu [Thu, 26 Dec 2013 03:46:25 +0000 (11:46 +0800)]
ilo: rename and add an accessor for texture slices

Rename ilo_texture::slice_offsets to ilo_texture::slices and add an accessor,
ilo_texture_get_slice().

10 years agoilo: add HiZ op support to the pipelines
Chia-I Wu [Sat, 28 Dec 2013 07:57:49 +0000 (15:57 +0800)]
ilo: add HiZ op support to the pipelines

Add blitter functions to perform Depth Buffer Clear, Depth Buffer Resolve, and
Hierarchical Depth Buffer Resolve.  Those functions set ilo_blitter up and
pass it to the pipelines to emit the commands.

10 years agoilo: add support for HiZ allocation
Chia-I Wu [Mon, 6 Jan 2014 15:32:32 +0000 (23:32 +0800)]
ilo: add support for HiZ allocation

Add tex_create_hiz() to create HiZ bo.  It is not really called yet.

10 years agoilo: refactor separate stencil allocation
Chia-I Wu [Sat, 21 Dec 2013 13:21:24 +0000 (21:21 +0800)]
ilo: refactor separate stencil allocation

Move separate stencil allocation code to tex_create_separate_stencil to keep
tex_create sane.

10 years agoilo: assorted GPE fixes for HiZ
Chia-I Wu [Mon, 6 Jan 2014 15:32:56 +0000 (23:32 +0800)]
ilo: assorted GPE fixes for HiZ

Allow HiZ op to be specified in 3DSTATE_WM.  Pass depth format directly in
gen7_emit_3DSTATE_SF.  Use tex->hiz.bo to determine if HiZ exists.  Fix
3DSTATE_SF for the case when there is no ilo_rasterizer_state.  Fix
3DSTATE_PS for the case when there is no ilo_shader_state.

10 years agoilo: no layer offsetting on GEN7+
Chia-I Wu [Sat, 21 Dec 2013 12:09:49 +0000 (20:09 +0800)]
ilo: no layer offsetting on GEN7+

Even though the Ivy Bridge PRM lists some restrictions that require layer
offsetting as the Sandy Bridge PRM does, it seems they are actually lifted.

10 years agoilo: offset to layers only when necessary
Chia-I Wu [Fri, 20 Dec 2013 06:45:59 +0000 (14:45 +0800)]
ilo: offset to layers only when necessary

GEN6 has several requirements regarding the LOD/Depth/Width/Height of the
render targets and the depth buffer.  We used to offset to the layers in
question unconditionally to meet the requirements.  With this commit,
offsetting is done only when the requirements are not met.

10 years agoilo: allow ilo_zs_surface to skip layer offsetting
Chia-I Wu [Fri, 20 Dec 2013 16:31:33 +0000 (00:31 +0800)]
ilo: allow ilo_zs_surface to skip layer offsetting

Make offset to layer optional in ilo_gpe_init_zs_surface.

10 years agoilo: allow ilo_view_surface to skip layer offsetting
Chia-I Wu [Fri, 20 Dec 2013 15:59:34 +0000 (23:59 +0800)]
ilo: allow ilo_view_surface to skip layer offsetting

Make offset to layer optional in ilo_gpe_init_view_surface_for_texture.
render_cache_rw is always the same as is_rt and is replaced.

10 years agoi965/fs: do SEL optimization only when src type for MOV matches
Tapani Pälli [Tue, 7 Jan 2014 08:25:40 +0000 (10:25 +0200)]
i965/fs: do SEL optimization only when src type for MOV matches

Fixes a bug where the then-branch operates on ivec4 while the else-branch uses vec4.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72379

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglsl: Optimize pow(2, x) --> exp2(x).
Kenneth Graunke [Mon, 6 Jan 2014 06:57:01 +0000 (22:57 -0800)]
glsl: Optimize pow(2, x) --> exp2(x).

On Haswell, POW takes 24 cycles, while EXP2 only takes 14.  Plus, using
POW requires putting 2.0 in a register, while EXP2 doesn't.

I believe that EXP2 will be faster than POW on basically all GPUs, so
it makes sense to optimize it.

Looking at the savage2 subset of shader-db:
total instructions in shared programs: 113225 -> 113179 (-0.04%)
instructions in affected programs:     2139 -> 2093 (-2.15%)
instances of 'math pow':               795 -> 749 (-6.14%)
instances of 'math exp':               389 -> 435 (11.8%)
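
A minimal sketch of this kind of algebraic rewrite on a toy expression tree. The Expr type and helpers are hypothetical and far simpler than the GLSL IR; they only show the shape of the transformation.

```cpp
#include <cassert>
#include <cmath>
#include <memory>
#include <utility>

// Toy expression node (hypothetical, not the GLSL IR).
struct Expr {
    enum Op { OP_CONST, OP_VAR, OP_POW, OP_EXP2 } op = OP_CONST;
    float value = 0.0f;              // used by OP_CONST
    std::unique_ptr<Expr> lhs, rhs;  // OP_POW uses both, OP_EXP2 uses lhs
};

std::unique_ptr<Expr> make_const(float v)
{
    std::unique_ptr<Expr> e(new Expr());
    e->op = Expr::OP_CONST;
    e->value = v;
    return e;
}

std::unique_ptr<Expr> make_var()
{
    std::unique_ptr<Expr> e(new Expr());
    e->op = Expr::OP_VAR;
    return e;
}

std::unique_ptr<Expr> make_pow(std::unique_ptr<Expr> base,
                               std::unique_ptr<Expr> exponent)
{
    std::unique_ptr<Expr> e(new Expr());
    e->op = Expr::OP_POW;
    e->lhs = std::move(base);
    e->rhs = std::move(exponent);
    return e;
}

// Rewrite pow(2, x) into exp2(x); leave everything else untouched.
std::unique_ptr<Expr> optimize_pow(std::unique_ptr<Expr> e)
{
    if (e->op == Expr::OP_POW && e->lhs->op == Expr::OP_CONST &&
        e->lhs->value == 2.0f) {
        std::unique_ptr<Expr> r(new Expr());
        r->op = Expr::OP_EXP2;
        r->lhs = std::move(e->rhs);  // the constant 2.0 is no longer needed
        return r;
    }
    return e;
}

// Evaluate the tree with the single variable bound to x.
float eval(const Expr &e, float x)
{
    switch (e.op) {
    case Expr::OP_CONST: return e.value;
    case Expr::OP_VAR:   return x;
    case Expr::OP_POW:   return std::pow(eval(*e.lhs, x), eval(*e.rhs, x));
    case Expr::OP_EXP2:  return std::exp2(eval(*e.lhs, x));
    }
    return 0.0f;
}
```

The rewritten tree no longer references the constant 2.0, which matches the point about not needing a register for it.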

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglsl: Refactor is_zero/one/negative_one into an is_value() method.
Kenneth Graunke [Mon, 6 Jan 2014 06:42:31 +0000 (22:42 -0800)]
glsl: Refactor is_zero/one/negative_one into an is_value() method.

This patch creates a new generic is_value() method, which checks if an
ir_constant has a particular value.  (For vectors, it must have the
single value repeated across all components.)

It then rewrites the is_zero/is_one/is_negative_one methods to use this
generic helper.  All three were basically identical except for the value
they checked for.  The other difference is that is_negative_one rejects
boolean types.  The new is_value function maintains this behavior, only
allowing boolean types when checking for 0 or 1.
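
A sketch of how such a generic helper might look, using a simplified stand-in for a constant (the struct layout and names here are assumptions made for illustration, not the ir_constant class):

```cpp
#include <cassert>

// Simplified stand-in for a scalar or vector constant.
struct Constant {
    enum Type { TYPE_FLOAT, TYPE_INT, TYPE_BOOL } type;
    unsigned components;  // 1..4
    float f[4];
    int i[4];
    bool b[4];

    // True if every component equals the given value.  The float and int
    // forms of the value are both passed so each type compares natively.
    // Booleans only make sense for 0 or 1, so anything else is rejected.
    bool is_value(float fv, int iv) const
    {
        if (type == TYPE_BOOL && iv != 0 && iv != 1)
            return false;

        for (unsigned c = 0; c < components; c++) {
            switch (type) {
            case TYPE_FLOAT: if (f[c] != fv) return false; break;
            case TYPE_INT:   if (i[c] != iv) return false; break;
            case TYPE_BOOL:  if (b[c] != (iv != 0)) return false; break;
            }
        }
        return true;
    }

    // The three original predicates become one-line wrappers.
    bool is_zero() const { return is_value(0.0f, 0); }
    bool is_one() const { return is_value(1.0f, 1); }
    bool is_negative_one() const { return is_value(-1.0f, -1); }
};
```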

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agoglsl: Optimize pow(1.0, X) --> 1.0.
Kenneth Graunke [Mon, 6 Jan 2014 06:19:42 +0000 (22:19 -0800)]
glsl: Optimize pow(1.0, X) --> 1.0.

Surprisingly, this helps one vertex shader in 3DMMES.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
10 years agomesa: Use get_local_param_pointer in glProgramLocalParameters4fvEXT().
Kenneth Graunke [Mon, 6 Jan 2014 04:03:00 +0000 (20:03 -0800)]
mesa: Use get_local_param_pointer in glProgramLocalParameters4fvEXT().

Using the get_local_param_pointer helper ensures that the LocalParams
arrays have actually been allocated before attempting to use them.

glProgramLocalParameters4fvEXT needs to do a bit of extra checking,
but it can be simplified since the helper has already validated the
target.
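
The pattern can be sketched as follows: one helper is the only place that hands out parameter storage, and it allocates on first use, so callers cannot touch an unallocated array. Everything here (types, names, the limit of 256) is hypothetical and much simpler than the Mesa code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

const std::size_t kMaxLocalParams = 256;  // hypothetical limit

// Hypothetical program object; local_params stays empty until first use.
struct Program {
    std::vector<std::array<float, 4>> local_params;
};

// Return a pointer to parameter `index`, allocating the storage lazily.
// Returns nullptr if the index is out of range.
float *get_local_param_pointer(Program &prog, std::size_t index)
{
    if (index >= kMaxLocalParams)
        return nullptr;
    if (prog.local_params.empty())
        prog.local_params.resize(kMaxLocalParams);  // value-initialized to 0
    return prog.local_params[index].data();
}

// Set `count` consecutive vec4 parameters starting at `index`.  The extra
// check here is that the whole range fits; the helper validates the rest.
bool set_local_params(Program &prog, std::size_t index, std::size_t count,
                      const float *params)
{
    if (count == 0 || count > kMaxLocalParams ||
        index > kMaxLocalParams - count)
        return false;

    float *dest = get_local_param_pointer(prog, index);
    if (!dest)
        return false;

    // std::array<float, 4> elements are contiguous inside the vector, so
    // the range can be filled as 4 * count consecutive floats.
    for (std::size_t n = 0; n < 4 * count; n++)
        dest[n] = params[n];
    return true;
}
```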

Fixes crashes in programs that use Cg (for example, Awesomenauts,
Rocketbirds: Hardboiled Chicken, and Tiny and Big: Grandpa's Leftovers)
since commit e5885c119de1e508099cc1111e1c9f8ff00fab88
(mesa: Dynamically allocate the storage for program local parameters.)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73136
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Laurent Carlier <lordheavym@gmail.com>