mesa.git
9 years agor600g: Remove dead assigment to 'gs_input_prim' in shader state
Edward O'Callaghan [Sat, 29 Aug 2015 08:31:06 +0000 (18:31 +1000)]
r600g: Remove dead assigment to 'gs_input_prim' in shader state

Note that 'geometry shader properties' should be carried in the
selector state over the shader state in any case.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
9 years agoradeonsi: don't use the emit qt keyword in si_init_atom
Marek Olšák [Tue, 25 Aug 2015 17:21:38 +0000 (19:21 +0200)]
radeonsi: don't use the emit qt keyword in si_init_atom

It confuses my editor.

9 years agoradeonsi: remove no-op 32-bit masking
Marek Olšák [Sun, 23 Aug 2015 11:05:53 +0000 (13:05 +0200)]
radeonsi: remove no-op 32-bit masking

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
9 years agogallium/radeon: fix the ADDRESS_HI mask for EVENT_WRITE CIK packets
Marek Olšák [Sun, 23 Aug 2015 10:57:09 +0000 (12:57 +0200)]
gallium/radeon: fix the ADDRESS_HI mask for EVENT_WRITE CIK packets

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
9 years agowinsys/radeon: handle non-zero finite timeout when waiting for buffers
Marek Olšák [Sat, 22 Aug 2015 16:05:37 +0000 (18:05 +0200)]
winsys/radeon: handle non-zero finite timeout when waiting for buffers

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
9 years agofreedreno/a3xx: implement half-z clipping
Ilia Mirkin [Wed, 26 Aug 2015 04:11:23 +0000 (00:11 -0400)]
freedreno/a3xx: implement half-z clipping

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agofreedreno/a3xx: add basic clip plane support
Ilia Mirkin [Tue, 25 Aug 2015 03:31:00 +0000 (23:31 -0400)]
freedreno/a3xx: add basic clip plane support

The hardware is capable of dealing with GL1-style user clip planes.
No clip vertex, no clip distances. Fixes a number of ucp tests, as well
as neverball.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
9 years agonvc0: change prefix of MP performance counters to HW_SM
Samuel Pitoiset [Sat, 29 Aug 2015 08:58:49 +0000 (10:58 +0200)]
nvc0: change prefix of MP performance counters to HW_SM

According to NVIDIA, local performance counters (MP) are prefixed
with SM, while global performance counters (PCOUNTER) are called PM.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
9 years agonvc0: sort performance counter queries by name
Samuel Pitoiset [Fri, 28 Aug 2015 17:09:33 +0000 (19:09 +0200)]
nvc0: sort performance counter queries by name

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
9 years agonvc0: make names of performance counter queries consistent
Samuel Pitoiset [Fri, 28 Aug 2015 16:41:16 +0000 (18:41 +0200)]
nvc0: make names of performance counter queries consistent

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
9 years agonvc0: use enumerations for driver queries
Samuel Pitoiset [Fri, 28 Aug 2015 16:30:13 +0000 (18:30 +0200)]
nvc0: use enumerations for driver queries

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
9 years agonvc0: remove commented out code related to PCOUNTER queries
Samuel Pitoiset [Fri, 28 Aug 2015 16:15:13 +0000 (18:15 +0200)]
nvc0: remove commented out code related to PCOUNTER queries

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
9 years agor600: port si_conv_prim_to_gs_out from radeonsi
Dave Airlie [Fri, 28 Aug 2015 00:46:10 +0000 (10:46 +1000)]
r600: port si_conv_prim_to_gs_out from radeonsi

This code was broken by the tess merge, and I totally missed it
until now. I'm not sure this fixes anything but it stops the assert.

Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agor600g: use PRIi64 for some compute debug printfs
Dave Airlie [Thu, 27 Aug 2015 23:58:15 +0000 (09:58 +1000)]
r600g: use PRIi64 for some compute debug printfs

Otherwise this will crash on 32-bit, and it gets rid of
warnings building on 32-bit.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agogallium/util: fix debug_get_flags_option on 32-bit
Dave Airlie [Thu, 27 Aug 2015 23:57:04 +0000 (09:57 +1000)]
gallium/util: fix debug_get_flags_option on 32-bit

On 32-bit we need to use PRIu64 flags for printfs,
otherwise this segfaults in R600_DEBUG=help otherwise.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agoglsl: provide the option of using BFE for unpack builting lowering
Ilia Mirkin [Fri, 21 Aug 2015 01:55:52 +0000 (21:55 -0400)]
glsl: provide the option of using BFE for unpack builting lowering

This greatly improves generated code, especially for the snorm variants,
since it is able to get rid of the lshift/rshift for sext, as well as
replacing each shift + mask with a single op.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years agoglsl: use bitfield_insert instead of and + shift + or for packing
Ilia Mirkin [Fri, 21 Aug 2015 00:52:32 +0000 (20:52 -0400)]
glsl: use bitfield_insert instead of and + shift + or for packing

It is fairly tricky to detect the proper conditions for using bitfield
insert, but easy to just use it up front. This removes a lot of
instructions on nvc0 when invoking the packing builtins.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years agoi965/fs: Remove fs_visitor::try_replace_with_sel().
Matt Turner [Mon, 17 Aug 2015 21:38:31 +0000 (14:38 -0700)]
i965/fs: Remove fs_visitor::try_replace_with_sel().

No shader-db changes on g4x, snb, hsw, or bdw.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agoi965/fs: Replace awful variable names.
Matt Turner [Fri, 28 Aug 2015 01:30:34 +0000 (18:30 -0700)]
i965/fs: Replace awful variable names.

   start_to      -> dst_start
   end_to        -> dst_end
   start_from    -> src_start
   end_from      -> src_end
   var_to        -> dst_var
   var_from      -> src_var
   reg_to        -> dst_reg
   reg_to_offset -> dst_reg_offset
   reg_from      -> src_reg

Not sure how these made sense to me before.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agoi965/fs: Skip blocks in register coalescing interference check.
Matt Turner [Wed, 19 Aug 2015 00:47:00 +0000 (17:47 -0700)]
i965/fs: Skip blocks in register coalescing interference check.

No need to walk through instructions in blocks we know don't contain our
registers' live ranges.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agoi965/fs: Improve register coalescing interference check.
Matt Turner [Mon, 17 Aug 2015 23:03:27 +0000 (16:03 -0700)]
i965/fs: Improve register coalescing interference check.

I always thought that the is_control_flow() -> return false check was a
bad hack, and some previous attempts to remove it have failed and have
been reverted.

The previous two patches fix some problems that caused register
coalescing to not notice some interference between registers, which the
is_control_flow() check apparently works around.

With that fixed, we can calculate interference more accurately.

total instructions in shared programs: 6261319 -> 6257917 (-0.05%)
instructions in affected programs:     346282 -> 342880 (-0.98%)
helped:                                1552

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agoi965/fs: Use overwrites_reg() instead of dst.equals().
Matt Turner [Wed, 19 Aug 2015 00:10:44 +0000 (17:10 -0700)]
i965/fs: Use overwrites_reg() instead of dst.equals().

equals() returns false for registers with different types, using it
isn't appropriate to determine whether an is overwriting a register.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agoi965: Only consider fixed_hw_reg in equals() if file is HW_REG/IMM.
Matt Turner [Tue, 18 Aug 2015 21:28:03 +0000 (14:28 -0700)]
i965: Only consider fixed_hw_reg in equals() if file is HW_REG/IMM.

Noticed when debugging things that lead to the next patch.

On G45 (and presumably ILK) this helps register coalescing:

total instructions in shared programs: 4077373 -> 4077340 (-0.00%)
instructions in affected programs:     43751 -> 43718 (-0.08%)
helped:                                52
HURT:                                  2

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agoi965/fs: Do not set the size for zero-size uniforms
Marta Lofstedt [Fri, 28 Aug 2015 08:22:41 +0000 (10:22 +0200)]
i965/fs: Do not set the size for zero-size uniforms

Zero sized uniforms can exist in the list, but they don't get get any space
allocated in prog_data->params or in the param_size array, so the size
should not be set for them.  This was previously fixed in:

commit: 781dc7c0e1f41502f18e07c0940af949a78d2792.

However,

commit: 259f7291de2387aa3ac5f856b39b7b934a1d8e7d

removed the fix.

Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
9 years agomesa: return old name for deleted samplers for SAMPLER_BINDING queries
Daniel Scharrer [Fri, 28 Aug 2015 09:45:36 +0000 (11:45 +0200)]
mesa: return old name for deleted samplers for SAMPLER_BINDING queries

If the sampler object has been deleted in the same context the binding
will have been cleared. If it has been deleted in another context, the
spec does not say what should returned. None of the other binding point
queries check for deletion in another context.

Also, as names of deleted objects are free for reuse, the current code
didn't even work reliably.

Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
9 years agomesa: add missing queries for ARB_direct_state_access
Daniel Scharrer [Fri, 28 Aug 2015 09:45:35 +0000 (11:45 +0200)]
mesa: add missing queries for ARB_direct_state_access

This adds index queries (glGet*i_v) for GL_TEXTURE_BINDING_* and
GL_SAMPLER_BINDING, as well as textue queries
(glGetTex{,ture}Parameter*) for GL_TEXTURE_TARGET.

CC: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
9 years agodocs: Fix a typo in GL3.txt concerning GL_KHR_context_flush_control
Neil Roberts [Fri, 28 Aug 2015 13:29:22 +0000 (14:29 +0100)]
docs: Fix a typo in GL3.txt concerning GL_KHR_context_flush_control

9 years agomesa: fix dispatch sanity with GL_OES_texture_storage_multisample_2d_array
Ilia Mirkin [Fri, 28 Aug 2015 06:50:25 +0000 (02:50 -0400)]
mesa: fix dispatch sanity with GL_OES_texture_storage_multisample_2d_array

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91785
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Matt Turner <mattst88@gmail.com>
9 years agoABI-check: Use more portable bash invocation.
Vinson Lee [Tue, 21 Jul 2015 21:02:01 +0000 (14:02 -0700)]
ABI-check: Use more portable bash invocation.

Fixes 'make check' on FreeBSD.

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
9 years agoi965/nir: Make use of nir_opt_undef
Boyan Ding [Fri, 21 Aug 2015 13:42:45 +0000 (21:42 +0800)]
i965/nir: Make use of nir_opt_undef

Shader-db result on Ivy Bridge:
total instructions in shared programs: 145484 -> 145445 (-0.03%)
instructions in affected programs:     225 -> 186 (-17.33%)
helped:                                5
HURT:                                  0

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
9 years agoglapi: Remove _x86_64_get_get_dispatch symbol from x86-64 assembly.
Matt Turner [Thu, 25 Sep 2014 18:49:48 +0000 (11:49 -0700)]
glapi: Remove _x86_64_get_get_dispatch symbol from x86-64 assembly.

Never used.

Reviewed-by: Mark Janes <mark.a.janes@intel.com>
9 years agoglsl: clean up textureSize prototype
Ilia Mirkin [Wed, 12 Aug 2015 15:55:53 +0000 (11:55 -0400)]
glsl: clean up textureSize prototype

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
9 years agor600g/sb: Don't crash on empty if jump target
Glenn Kennard [Thu, 27 Aug 2015 17:04:17 +0000 (19:04 +0200)]
r600g/sb: Don't crash on empty if jump target

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agor600g/sb: Don't read junk after EOP
Glenn Kennard [Thu, 27 Aug 2015 17:04:16 +0000 (19:04 +0200)]
r600g/sb: Don't read junk after EOP

Shaders that contain instruction data after an instruction with EOP could end
up parsing that as an instruction, leading to various crashes and asserts in
SB as it gets very confused if it sees for instance a loop start instruction
jumping off to some random point.

Add a couple of asserts, and print EOP bit if set in old asm printer.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agor600g/sb: Handle undef in read port tracker
Glenn Kennard [Thu, 27 Aug 2015 17:04:15 +0000 (19:04 +0200)]
r600g/sb: Handle undef in read port tracker

e8e443 missed adding check for undef values also in
unreserve function, leading to an assert triggering.

Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agomesa: rename rowStride to imageStride in texturesubimage()
Brian Paul [Thu, 27 Aug 2015 20:33:40 +0000 (14:33 -0600)]
mesa: rename rowStride to imageStride in texturesubimage()

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agomesa: only copy the requested teximage faces
Ilia Mirkin [Thu, 27 Aug 2015 19:28:24 +0000 (15:28 -0400)]
mesa: only copy the requested teximage faces

Cube maps are special in that they have separate teximages for each
face. We handled that by copying the data to them separately, but in
case zoffset != 0 or depth != 6 we would read off the end of the client
array or modify the wrong images.

zoffset/depth have already been verified by the time the code gets to
this stage, so no need to double-check.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
9 years agonir: Convert the builder to use the new NIR cursor API.
Kenneth Graunke [Thu, 6 Aug 2015 14:16:07 +0000 (07:16 -0700)]
nir: Convert the builder to use the new NIR cursor API.

The NIR cursor API is exactly what we want for the builder's insertion
point.  This simplifies the API, the implementation, and is actually
more flexible as well.

This required a bit of reworking of TGSI->NIR's if/loop stack handling;
we now store cursors instead of cf_node_lists, for better or worse.

v2: Actually move the cursor in the after_instr case.
v3: Take advantage of nir_instr_insert (suggested by Connor).
v4: vc4 build fixes (thanks to Eric).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> [v4]
Acked-by: Connor Abbott <cwabbott0@gmail.com> [v4]
9 years agonir: Convert the NIR instruction insertion API to use cursors.
Kenneth Graunke [Mon, 10 Aug 2015 01:30:33 +0000 (18:30 -0700)]
nir: Convert the NIR instruction insertion API to use cursors.

This patch implements a general nir_instr_insert() function that takes a
nir_cursor for the insertion point.  It then reworks the existing API to
simply be a wrapper around that for compatibility.

This largely involves moving the existing code into a new function.

Suggested by Connor Abbott.

v2: Make the legacy functions static inline in nir.h (requested by
    Connor Abbott).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Acked-by: Connor Abbott <cwabbott0@gmail.com>
9 years agonir: Move nir_cursor to nir.h.
Kenneth Graunke [Tue, 25 Aug 2015 17:01:31 +0000 (10:01 -0700)]
nir: Move nir_cursor to nir.h.

We want to use this for normal instruction insertion too, not just
control flow.  Generally these functions are going to be extremely
useful when working with NIR, so I want them to be widely available
without having to include a separate file.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Acked-by: Connor Abbott <cwabbott0@gmail.com>
9 years agonir: Strengthen "no jumps" assertions in instruction insertion API.
Kenneth Graunke [Tue, 25 Aug 2015 00:30:08 +0000 (17:30 -0700)]
nir: Strengthen "no jumps" assertions in instruction insertion API.

Jumps must be the last instruction in a block, so inserting another
instruction after a jump is illegal.

Previously, we only checked this when the new instruction being inserted
was a jump.  This is a red herring - inserting *any* kind of instruction
after a jump is illegal.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Acked-by: Connor Abbott <cwabbott0@gmail.com>
9 years agost/mesa: use PROGRAM_ARRAY for storing structs containing arrays
Brian Paul [Wed, 26 Aug 2015 19:58:23 +0000 (13:58 -0600)]
st/mesa: use PROGRAM_ARRAY for storing structs containing arrays

Previously, we used PROGRAM_ARRAY only for variables which were
arrays or matrices.  But if the variable is a structure containing
an array or matrix, we need to use PROGRAM_ARRAY for that too.

Before, we failed an assertion:
  state_tracker/st_glsl_to_tgsi.cpp:4900:
  Assertion `src_reg->file != PROGRAM_TEMPORARY' failed.
when running the piglit test
glsl-1.20/execution/fs-const-array-of-struct-of-array.shader_test

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
9 years agoglsl: fix comment typo: s/filed/field/
Brian Paul [Wed, 26 Aug 2015 19:23:47 +0000 (13:23 -0600)]
glsl: fix comment typo: s/filed/field/

9 years agogallium/util: fix code formatting in u_blitter.h
Brian Paul [Tue, 25 Aug 2015 19:25:31 +0000 (13:25 -0600)]
gallium/util: fix code formatting in u_blitter.h

Trivial.

9 years agoi965/fs: Split VGRFs after lowering pull constants
Jason Ekstrand [Wed, 19 Aug 2015 21:29:53 +0000 (14:29 -0700)]
i965/fs: Split VGRFs after lowering pull constants

The split_virtual_grfs code doesn't properly rewrite reladdr so we need to
make sure that any uniform indirects are lowered away first.

This fixes the glsl-fs-uniform-indexed-by-swizzled-vec4.shader_test in piglit

Cc: "10.6" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agoi964/fs: Refactor assign_constant_locations
Jason Ekstrand [Wed, 19 Aug 2015 00:40:02 +0000 (17:40 -0700)]
i964/fs: Refactor assign_constant_locations

Now that all constant locations are assigned in a single function, we can
refactor it a bit to unify things.  In particular, we now handle
pull_constant_loc and push_constant_loc more similarly and we only modify
stage_prog_data->params[] in one place at the end of the function.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
9 years agoi965: Rename INTEL_DEBUG=vec4vs to INTEL_DEBUG=vec4.
Kenneth Graunke [Tue, 25 Aug 2015 23:17:14 +0000 (16:17 -0700)]
i965: Rename INTEL_DEBUG=vec4vs to INTEL_DEBUG=vec4.

driParseDebugString() doesn't have actual code to parse comma separated
lists (or any other supported options?); instead it dumbly uses strstr().

This means that INTEL_DEBUG="vec4vs" will trigger both DEBUG_VEC4VS and
DEBUG_VS, as "vs" is also a substring.

We should probably improve the driconf parsing, but for now, just rename
the option so it's usable in the meantime.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Jason Ekstrand <jason.ekstrand@intel.com>
Acked-by: Kristian Høgsberg <krh@bitplanet.net>
9 years agomesa: enable enums for OES_texture_storage_multisample_2d_array
Tapani Pälli [Mon, 24 Aug 2015 07:09:52 +0000 (10:09 +0300)]
mesa: enable enums for OES_texture_storage_multisample_2d_array

v2: use _mesa_is_gles31(ctx) for verifying we are on ES 3.1,
    remove _es31 usage from get_hash_params.py

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agoglsl: add support for OES_texture_storage_multisample_2d_array
Tapani Pälli [Fri, 21 Aug 2015 06:42:10 +0000 (09:42 +0300)]
glsl: add support for OES_texture_storage_multisample_2d_array

v2: use ARB_texture_multisample enable bit

Patch adds extension enable bit and enables required keywords
and builtin functions for the extension.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agomesa: Add extension enable for OES_texture_storage_multisample_2d_array
Tapani Pälli [Fri, 21 Aug 2015 06:40:11 +0000 (09:40 +0300)]
mesa: Add extension enable for OES_texture_storage_multisample_2d_array

v2: use ARB_texture_multisample bit to enable extension

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agoglapi: add GL_OES_texture_storage_multisample_2d_array extension
Tapani Pälli [Fri, 21 Aug 2015 06:43:27 +0000 (09:43 +0300)]
glapi: add GL_OES_texture_storage_multisample_2d_array extension

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agoswrast: add a new macro, FETCH_COMPRESSED
Nanley Chery [Sun, 31 May 2015 20:29:41 +0000 (13:29 -0700)]
swrast: add a new macro, FETCH_COMPRESSED

This patch creates a new macro, FETCH_COMPRESSED - similar in nature
to the other FETCH_* macros. This reduces repetition in the code that
deals with compressed textures.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agomesa: return bool instead of GLboolean in compressedteximage_only_format()
Nanley Chery [Thu, 18 Jun 2015 00:14:40 +0000 (17:14 -0700)]
mesa: return bool instead of GLboolean in compressedteximage_only_format()

In agreement with the coding style, functions that aren't directly visible
to the GL API should prefer the use of bool over GLboolean.

Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agoi965: refactor miptree alignment calculation code
Nanley Chery [Thu, 28 May 2015 23:02:34 +0000 (16:02 -0700)]
i965: refactor miptree alignment calculation code

Remove redundant checks and comments by grouping our calculations for
align_w and align_h wherever possible.

v2: reintroduce brw.
    don't include functional changes.
    don't adjust function parameters or create a new function.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agoi965: change the meaning of cpp for compressed textures
Nanley Chery [Thu, 21 May 2015 21:27:55 +0000 (14:27 -0700)]
i965: change the meaning of cpp for compressed textures

An ASTC block takes up 16 bytes for all block width and height configurations.
This size is not integrally divisible by all ASTC block widths. Therefore cpp
is changed to mean bytes per block if the texture is compressed.

Because the original definition was bytes per block divided by block width, all
references to the mipmap width must be divided the block width. This keeps the
address calculation formulas consistent. For example, the units for miptree_level
x_offset and miptree total_width has changed from pixels to blocks.

v2: reuse preexisting ALIGN_NPOT macro located in an i965 driver file.
v3: move ALIGN_NPOT into seperate commit.
    simplify cpp assignment in copy_image_with_blitter().
    update miptree width and offset variables in: intel_miptree_copy_slice(),
        intel_miptree_map_gtt(), and brw_miptree_layout_texture_3d().

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agoi965: correct mt->align_h for 2D textures on Skylake
Nanley Chery [Thu, 18 Jun 2015 18:02:17 +0000 (11:02 -0700)]
i965: correct mt->align_h for 2D textures on Skylake

In agreement with commit 4ab8d59a23, vertical alignment values are equal to
four times the block height on Gen9+.

v2: add newlines to separate declarations, statments, and comments.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agoi965: use ALIGN_NPOT for setting ASTC mipmap layouts
Nanley Chery [Thu, 21 May 2015 21:27:55 +0000 (14:27 -0700)]
i965: use ALIGN_NPOT for setting ASTC mipmap layouts

ALIGN is changed to ALIGN_NPOT because alignment values are sometimes not
powers of two when working with ASTC.

v2: handle texture arrays and LDR-only systems.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agomesa/macros: move ALIGN_NPOT to macros.h
Nanley Chery [Tue, 2 Jun 2015 18:03:22 +0000 (11:03 -0700)]
mesa/macros: move ALIGN_NPOT to macros.h

Aligning with a non-power-of-two number is a general task that can be used in
various places. This commit is required for the next one.

v2: add greater than 0 assertion (Anuj).
    convert the macro to a static inline function.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agomesa/macros: add power-of-two assertions for alignment macros
Nanley Chery [Wed, 27 May 2015 20:25:30 +0000 (13:25 -0700)]
mesa/macros: add power-of-two assertions for alignment macros

ALIGN and ROUND_DOWN_TO both require that the alignment value passed
into the macro be a power of two in the comments. Using software assertions
verifies this to be the case.

v2: use static inline functions instead of gcc-specific statement expressions (Brian).
v3: fix indendation (Brian).
v4: add greater than zero requirement (Anuj).

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agoi965/surface_formats: add support for 2D ASTC surface formats
Nanley Chery [Wed, 15 Apr 2015 21:15:10 +0000 (14:15 -0700)]
i965/surface_formats: add support for 2D ASTC surface formats

Define two-thirds of the 2D Intel ASTC surface formats (LDR-only). This allows
a 1-to-1 mapping from the mesa format to the Intel format.

ASTC textures will default to being processed in LDR mode. If there is
hardware support for HDR/Full mode and the texture is not sRGB, add the
format bit necessary to process it in HDR/Full mode.

v2: remove extra newlines.
v3: follow existing coding style in translate_tex_format().
v4: expound on the GEN9_SURFACE_ASTC_HDR_FORMAT_BIT comment.
    update SF table - ASTC is actually supported in Gen8.
v5: conform the ASTC MESA_FORMAT enums to the existing naming convention.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agomesa/teximage: return the base internal format of the ASTC formats
Nanley Chery [Tue, 28 Apr 2015 22:10:11 +0000 (15:10 -0700)]
mesa/teximage: return the base internal format of the ASTC formats

This is necesary to initialize the gl_texture_image struct.

From the KHR_texture_compression_astc_ldr spec:
  "Added to Section 3.8.6, Compressed Texture Images

   Add the tokens specified above to Table 3.16, Compressed Internal Formats.
   In all cases, the base internal format will be RGBA. The encoding allows
   images to be encoded with fewer channels, but this is always presented as
   RGBA to the sampler."

v2. use _mesa_is_astc_format().

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agomesa/teximage: accept ASTC formats for 3D texture specification
Nanley Chery [Mon, 27 Jul 2015 23:09:09 +0000 (16:09 -0700)]
mesa/teximage: accept ASTC formats for 3D texture specification

The ASTC spec was revised as follows:

   Revision 2, April 28, 2015 - added CompressedTex{Sub,}Image3D to
   commands accepting ASTC format tokens in the New Tokens section [...].

Support only exists in the HDR submode:

   Add a second new column "3D Tex." which is empty for all non-ASTC
   formats. If only the LDR profile is supported by the implementation,
   this column is also empty for all ASTC formats. If both the LDR and HDR
   profiles are supported only, this column is checked for all ASTC
   formats.

LDR-only systems should generate an INVALID_OPERATION error when
attempting to call CompressedTexImage3D with the TEXTURE_3D target.

v2. return the proper error for LDR-only systems.
v3. update is_astc_format().
v4. use _mesa_is_astc_format().
v5. place logic in _mesa_target_can_be_compressed.
v6. fix issues handling ASTC formats.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agomesa/texcompress: enable translation between MESA and GL ASTC formats
Nanley Chery [Tue, 28 Apr 2015 22:08:32 +0000 (15:08 -0700)]
mesa/texcompress: enable translation between MESA and GL ASTC formats

v3. conform the ASTC MESA_FORMAT enums to the existing naming convention.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agomesa/glformats: recognize ASTC formats as compressed
Nanley Chery [Tue, 19 May 2015 22:41:56 +0000 (15:41 -0700)]
mesa/glformats: recognize ASTC formats as compressed

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agomesa: add ASTC extensions to the extensions table
Nanley Chery [Tue, 19 May 2015 22:41:28 +0000 (15:41 -0700)]
mesa: add ASTC extensions to the extensions table

v2: alphabetize the extensions.
    remove OES ASTC extension.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agomesa: don't enable online compression for ASTC formats
Nanley Chery [Mon, 18 May 2015 23:30:30 +0000 (16:30 -0700)]
mesa: don't enable online compression for ASTC formats

In agreement with the ASTC spec, this makes calls to TexImage*D unsuccessful.
Implied by the spec, Generate[Texture]Mipmap and [Copy]Tex[Sub]Image*D calls
must be unsuccessful as well.

v2. actually force attempts to compress online to fail.
v3. indentation (Matt).
v4. update copytexture_error_check to account for CopyTexImage*D (Chad).

Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agoglapi: add support for KHR_texture_compression_astc_ldr
Nanley Chery [Tue, 28 Apr 2015 21:41:49 +0000 (14:41 -0700)]
glapi: add support for KHR_texture_compression_astc_ldr

v2: correct the spelling of the sRGB variants.
    remove spaces around "=" when setting the enum value.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agomesa/formats: define the 2D ASTC formats
Nanley Chery [Tue, 19 May 2015 17:35:39 +0000 (10:35 -0700)]
mesa/formats: define the 2D ASTC formats

Define the mesa formats and make changes necessary for compilation
without errors. Also add support for _mesa_get_srgb_format_linear().

v2. conform the ASTC MESA_FORMAT enums to the existing naming convention.
v3. remove ASTC cases for _mesa_get_uncompressed_format(). This function is
    only used for generating mipmaps - something ASTC formats do not support
    due to lack of online compression.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agonouveau: avoid build failures since 0fc21ecf
Ilia Mirkin [Wed, 26 Aug 2015 18:04:03 +0000 (14:04 -0400)]
nouveau: avoid build failures since 0fc21ecf

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agogallium/radeon: read_registers should return bool meaning success or failure
Marek Olšák [Sat, 22 Aug 2015 12:17:10 +0000 (14:17 +0200)]
gallium/radeon: read_registers should return bool meaning success or failure

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
9 years agoradeonsi: add IB parser support for CP DMA packets
Marek Olšák [Wed, 19 Aug 2015 16:45:11 +0000 (18:45 +0200)]
radeonsi: add IB parser support for CP DMA packets

If the packet encoding is defined in the same format as register definitions,
the python script can process them automatically and the parser support
becomes trivial.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
9 years agoradeonsi: add IB tracing support for debug contexts
Marek Olšák [Wed, 19 Aug 2015 09:53:25 +0000 (11:53 +0200)]
radeonsi: add IB tracing support for debug contexts

This adds trace points to all IBs and the parser prints them and also
prints which trace points were reached (executed) by the CP.
This can help pinpoint a problematic packet, draw call, etc.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
9 years agoradeonsi: remove old CS tracing code
Marek Olšák [Mon, 17 Aug 2015 17:17:16 +0000 (19:17 +0200)]
radeonsi: remove old CS tracing code

Some of it is left there and it will be re-used in the next commit.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
9 years agoradeonsi: parse and dump status registers on GPU hang
Marek Olšák [Sat, 15 Aug 2015 22:54:34 +0000 (00:54 +0200)]
radeonsi: parse and dump status registers on GPU hang

GPU hang detection must be enabled by setting: GALLIUM_DDEBUG=[timeout in ms]

This may print too much information that we might not understand yet,
but some of the bits are very useful.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
9 years agoradeonsi: add an IB parser
Marek Olšák [Sat, 15 Aug 2015 21:57:22 +0000 (23:57 +0200)]
radeonsi: add an IB parser

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
9 years agoradeonsi: save the contents of indirect buffers for debug contexts
Marek Olšák [Sat, 15 Aug 2015 10:46:17 +0000 (12:46 +0200)]
radeonsi: save the contents of indirect buffers for debug contexts

This will be used by the IB parser.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
9 years agoradeonsi: generate register and packet tables for an IB parser from sid.h
Marek Olšák [Sat, 15 Aug 2015 21:44:04 +0000 (23:44 +0200)]
radeonsi: generate register and packet tables for an IB parser from sid.h

This makes writing a good IB parser a lot easier.

It generates 2 tables:
- packet3 table
- register table with all registers, fields, and named values

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
9 years agoradeonsi: remove duplicated register definitions and instruction definitions
Marek Olšák [Sat, 15 Aug 2015 16:48:06 +0000 (18:48 +0200)]
radeonsi: remove duplicated register definitions and instruction definitions

Instruction encoding isn't needed in Mesa.

The border color address registers were duplicated.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
9 years agor600g,radeonsi: remove unused ill-formed register field definitions
Marek Olšák [Sat, 15 Aug 2015 16:43:27 +0000 (18:43 +0200)]
r600g,radeonsi: remove unused ill-formed register field definitions

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
9 years agoradeonsi: add an initial dump_debug_state implementation dumping shaders
Marek Olšák [Sat, 15 Aug 2015 21:56:22 +0000 (23:56 +0200)]
radeonsi: add an initial dump_debug_state implementation dumping shaders

This is usually called after a draw call.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
9 years agoradeonsi: allow si_dump_key to write to a file
Marek Olšák [Sat, 11 Jul 2015 11:13:07 +0000 (13:13 +0200)]
radeonsi: allow si_dump_key to write to a file

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
9 years agogallium/ddebug: new pipe for hang detection and driver state dumping (v2)
Marek Olšák [Sat, 4 Jul 2015 12:10:21 +0000 (14:10 +0200)]
gallium/ddebug: new pipe for hang detection and driver state dumping (v2)

v2: lots of improvements

This is like identity or trace, but simpler. It doesn't wrap most states.

Run with:
  GALLIUM_DDEBUG=1000 [executable]
where "executable" is the app and "1000" is in miliseconds, meaning that
the context will be considered hung if a fence fails to signal in 1000 ms.

If that happens, all shaders, context states, bound resources, draw
parameters, and driver debug information (if any) will be dumped into:
  /home/$username/dd_dumps/$processname_$pid_$index.

Note that the context is flushed after every draw/clear/copy/blit operation
and then waited for to find the exact call that hangs.

You can also do:
  GALLIUM_DDEBUG=always
to do the dumping after every draw/clear/copy/blit operation without
flushing and waiting.

Examples of driver states that can be dumped are:
- Hardware status registers saying which hw block is busy (hung).
- Disassembled shaders in a human-readable form.
- The last submitted command buffer in a human-readable form.

v2: drop pipe-loader changes, drop SConscript
    rename dd.h -> dd_pipe.h

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
9 years agogallium: add flags parameter to pipe_screen::context_create
Marek Olšák [Sat, 25 Jul 2015 16:40:59 +0000 (18:40 +0200)]
gallium: add flags parameter to pipe_screen::context_create

This allows creating compute-only and debug contexts.

Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
9 years agogallium: add an interface for dumping debug driver state
Marek Olšák [Sat, 11 Jul 2015 10:34:46 +0000 (12:34 +0200)]
gallium: add an interface for dumping debug driver state

Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
9 years agomesa: remove pointless es31 checks, fix indirect to only be in es31
Ilia Mirkin [Mon, 24 Aug 2015 15:34:42 +0000 (11:34 -0400)]
mesa: remove pointless es31 checks, fix indirect to only be in es31

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
9 years agomesa: uncomment checks in es31 computation, add texture_ms
Ilia Mirkin [Mon, 24 Aug 2015 13:35:04 +0000 (09:35 -0400)]
mesa: uncomment checks in es31 computation, add texture_ms

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
9 years agomesa: create multisample fallback textures like normal textures
Marek Olšák [Sun, 23 Aug 2015 22:22:37 +0000 (00:22 +0200)]
mesa: create multisample fallback textures like normal textures

This works if drivers upsample on upload (like all radeon ones do).
The alternative is an unexpected GL error from anything calling
_mesa_update_state and possibly other issues.

Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
9 years agoradeonsi: mark unreachable paths to avoid warnings
Grazvydas Ignotas [Tue, 18 Aug 2015 00:23:29 +0000 (03:23 +0300)]
radeonsi: mark unreachable paths to avoid warnings

Otherwise we get:
warning: 'num_user_sgprs' may be used uninitialized in this function
...

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
9 years agomesa: GetTexLevelParameter{if}v changes for OpenGL ES 3.1
Tapani Pälli [Tue, 28 Jul 2015 08:25:35 +0000 (11:25 +0300)]
mesa: GetTexLevelParameter{if}v changes for OpenGL ES 3.1

Patch refactors existing parameters check to first check common enums
between desktop GL and GLES 3.1 and modifies get_tex_level_parameter_image
to be compatible with enums specified in 3.1.

v2: remove extra is_gles31() checks (suggested by Ilia)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (v1)
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> (v1)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
9 years agomesa/es3.1: Allow GL_COMPUTE_WORK_GROUP_SIZE for OpenGL ES 3.1
Marta Lofstedt [Wed, 19 Aug 2015 13:30:33 +0000 (15:30 +0200)]
mesa/es3.1: Allow GL_COMPUTE_WORK_GROUP_SIZE for OpenGL ES 3.1

According to OpenGL ES specification section 7.12,
GL_COMPUTE_WORK_GROUP_SIZE, is supported by the
glGetProgramiv function.

Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
9 years agomesa/es3.1: Enable getting MAX_COMPUTE_WORK_GROUP_ values for OpenGL ES 3.1
Marta Lofstedt [Wed, 19 Aug 2015 13:33:21 +0000 (15:33 +0200)]
mesa/es3.1: Enable getting MAX_COMPUTE_WORK_GROUP_ values for OpenGL ES 3.1

According to the OpenGL ES 3.1 specification chapter 17, the
MAX_COMPUTE_WORK_GROUP_COUNT and MAX_COMPUTE_WORK_GROUP_SIZE
is available for glGetIntegeri_v.

Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
9 years agomesa/formats: pass correct parameter to _mesa_is_format_compressed
Dave Airlie [Wed, 26 Aug 2015 00:37:09 +0000 (10:37 +1000)]
mesa/formats: pass correct parameter to _mesa_is_format_compressed

commit 26c549e69d12e44e2e36c09764ce2cceab262a1b
Author: Nanley Chery <nanley.g.chery@intel.com>
Date:   Fri Jul 31 10:26:36 2015 -0700

    mesa/formats: remove compressed formats from matching function

caused a regression in my CTS testing, this looks like a clear
thinko.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
sSigned-off-by: Dave Airlie <airlied@redhat.com>

9 years agogallium/auxiliary: optimize rgb9e5 helper some more
Roland Scheidegger [Sun, 9 Aug 2015 00:50:10 +0000 (02:50 +0200)]
gallium/auxiliary: optimize rgb9e5 helper some more

I used this as some testing ground for investigating some compiler
bits initially (e.g. lrint calls etc.), figured I could do much better
in the end just for fun...
This is mathematically equivalent, but uses some tricks to avoid
doubles and also replaces some float math with ints. Good for another
performance doubling or so. As a side note, some quick tests show that
llvm's loop vectorizer would be able to properly vectorize this version
(which it failed to do earlier due to doubles, producing a mess), giving
another 3 times performance increase with sse2 (more with sse4.1), but this
may not apply to mesa.
No piglit change.

Acked-by: Marek Olšák <marek.olsak@amd.com>
9 years agogallium/auxiliary: optimize rgb9e5 helper a bit
Roland Scheidegger [Sun, 9 Aug 2015 00:03:33 +0000 (02:03 +0200)]
gallium/auxiliary: optimize rgb9e5 helper a bit

This code (lifted straight from the extension) was doing things the most
inefficient way you could think of.
This drops some of the more expensive float operations, in particular
- int-cast floors (pointless, values always positive)
- 2 raised to (signed) integers (replace with simple exponent manipulation),
  getting rid of a misguided comment in the process (implement with table...)
- float division (replace with mul of reverse of those exponents)
This is like 3 times faster (measured for float3_to_rgb9e5), though it depends
(e.g. llvm is clever enough to replace exp2 with ldexp whereas gcc is not,
division is not too bad on cpus with early-exit divs).
Note that keeping the double math for now (float x + 0.5), as the results may
otherwise differ.

Acked-by: Marek Olšák <marek.olsak@amd.com>
9 years agomesa/texgetimage: fix missing stencil check
Dave Airlie [Sun, 23 Aug 2015 23:52:12 +0000 (09:52 +1000)]
mesa/texgetimage: fix missing stencil check

GetTexImage can read to stencil8 but only from
a stencil or depthstencil textures.

This fixes a bunch of failures in CTS
GL33-CTS.gtf32.GL3Tests.packed_pixels

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
9 years agomesa/teximage: Add GL error parameter to _mesa_target_can_be_compressed
Nanley Chery [Fri, 21 Aug 2015 20:09:08 +0000 (13:09 -0700)]
mesa/teximage: Add GL error parameter to _mesa_target_can_be_compressed

Enables _mesa_target_can_be_compressed to return the appropriate GL error
depending on it's inputs. Use the parameter to return the appropriate GL error
for ETC2 formats on GLES3.

Suggested-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agomesa/formats: remove compressed formats from matching function
Nanley Chery [Fri, 31 Jul 2015 17:26:36 +0000 (10:26 -0700)]
mesa/formats: remove compressed formats from matching function

All compressed formats return GL_FALSE and there isn't any evidence to
support that this behaviour would change. Remove all switch cases for
compressed formats.

v2. Since the exhaustive switch is removed, add a gtest to ensure
    all formats are handled.
v3. Ensure that GL_NO_ERROR is set before returning.
v4. Fix an arg to _mesa_uncompressed_format_to_type_and_comps();
    fix formatting and misc improvements (Chad).

Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agomesa/formats: make format testing a gtest
Nanley Chery [Tue, 18 Aug 2015 19:42:57 +0000 (12:42 -0700)]
mesa/formats: make format testing a gtest

We currently check that our format info table is sane during context
initialization in debug builds. Perform this check during
`make check` instead. This enables format testing in release builds
and removes the requirement of an exhuastive switch for
_mesa_uncompressed_format_to_type_and_comps().

v2. indentation and conditional inclusion fixes (Chad).
    allow tests to continue running if any format fails
    and display the failing format name.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
9 years agogallium/ttn: Use nir_builder_insert() rather than poking at cf_list.
Kenneth Graunke [Thu, 6 Aug 2015 14:44:35 +0000 (07:44 -0700)]
gallium/ttn: Use nir_builder_insert() rather than poking at cf_list.

I intend to remove nir_builder::cf_node_list, so I can't have this code
poking at it directly.  The proper way is to set the insertion point and
then simply insert things there.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
9 years agoprog_to_nir: Use nir_builder_insert() rather than poking at cf_list.
Kenneth Graunke [Thu, 6 Aug 2015 14:39:34 +0000 (07:39 -0700)]
prog_to_nir: Use nir_builder_insert() rather than poking at cf_list.

I intend to remove nir_builder::cf_node_list, so I can't have this code
poking at it directly.  The proper way is to set the insertion point and
then simply insert things there.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>