Jason Ekstrand [Tue, 21 Nov 2017 17:56:41 +0000 (09:56 -0800)]
anv/blorp: Rework image clear/resolve helpers
This replaces image_fast_clear and ccs_resolve with two new helpers that
simply perform an isl_aux_op whatever that may be on CCS or MCS. This
is a bit cleaner as it separates performing the aux operation from which
blorp helper we have to call to do it.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Tue, 21 Nov 2017 17:16:18 +0000 (09:16 -0800)]
intel/isl: Codify AUX operations in an enum
Right now, we have different entrypoints and enums in blorp for these
different operations. This provides us a central enum which we can
begin to transition to.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Gert Wollny [Thu, 8 Feb 2018 14:11:58 +0000 (15:11 +0100)]
r600/sb: Check whether optimizations would result in reladdr conflict
v2: * Check whether the node src and dst registers are NULL before using
them.
* fix a typo in the commit message.
Two cases are handled with this patch:
1. If copy propagation tries to eliminate a move from a relative
array access then it could optimize
MOV R1, ARRAY[RELADDR_1]
MOV R2, ARRAY[RELADDR_2]
OP2 R3, R1 R2
into
OP2 R3, ARRAY[RELADDR_1], ARRAY[RELADDR_2]
which is forbidden, because there is only one address register available.
2. When MULADD(x,a,MUL(x,c)) is handled, then
MUL TMP, R1, ARRAY[RELADDR_1]
MULADD R3, R1, ARRAY[RELADDR_2], TMP
could be folded into
ADD TMP, ARRAY[RELADDR_2], ARRAY[RELADDR_1]
MUL R3, R1, TMP
which is also forbidden.
Test for these cases and reject the optimization if a forbidden combination
of relative access would be created.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103142
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
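Note on the reladdr check above: the rule both cases share is that the
operands of one instruction may not need more than one relative address.
A minimal sketch of such a test, assuming a simplified operand model; the
names struct operand and reladdr_compatible() are hypothetical and do not
come from the sb code, and treating two operands that use the same
relative address as compatible is an assumption of this sketch:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operand model: rel_addr is 0 for a direct access, otherwise
 * an id naming the relative-address value the operand is indexed with. */
struct operand {
   int rel_addr;
};

/* Returns true when the operands need at most one distinct relative
 * address, i.e. when this sketch would allow them in one instruction. */
static bool
reladdr_compatible(const struct operand *ops, size_t count)
{
   int seen = 0;

   for (size_t i = 0; i < count; i++) {
      if (ops[i].rel_addr == 0)
         continue;               /* direct access, never conflicts */
      if (seen != 0 && seen != ops[i].rel_addr)
         return false;           /* a second, different address is needed */
      seen = ops[i].rel_addr;
   }
   return true;
}
```

An optimization pass would call such a test on the operand set it is
about to create and keep the original instructions when it fails.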
Glenn Kennard [Sun, 5 Mar 2017 17:26:54 +0000 (18:26 +0100)]
r600g: Implement spilling of temp arrays (v2)
Pessimistically spills arrays if GPR limit is exceeded.
v2: fix r600 support [airlied]
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 6 Feb 2018 04:17:46 +0000 (14:17 +1000)]
r600/sb: handle scratch mem reads on r600
On r600 we use the scratch mem with read/read_ind, in that case
sb should track the rw_gpr as a dst instead of a src.
This stops the whole shader being optimised out.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Glenn Kennard [Sun, 5 Mar 2017 17:26:53 +0000 (18:26 +0100)]
r600g/sb: Add dependency tracking for scratch ops
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Glenn Kennard [Sun, 5 Mar 2017 17:26:52 +0000 (18:26 +0100)]
r600g/sb: Support scratch ops
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Glenn Kennard [Sun, 5 Mar 2017 17:26:51 +0000 (18:26 +0100)]
r600g: Implement scratch buffer state management (v2)
v2: add Glenn's fixes
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Glenn Kennard [Sun, 5 Mar 2017 17:26:50 +0000 (18:26 +0100)]
r600g: Add pending output function
Spills have to happen after the VLIW bundle currently
processed, so defer emitting the spill op.
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Glenn Kennard [Sun, 5 Mar 2017 17:26:49 +0000 (18:26 +0100)]
r600g: Support emitting scratch ops
Signed-off-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 8 Feb 2018 06:19:28 +0000 (16:19 +1000)]
r600: fix texture gather swizzling.
This fixes:
KHR-GL45.texture_gather.swizzle
on cayman and redwood.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Timothy Arceri [Tue, 6 Feb 2018 03:38:57 +0000 (14:38 +1100)]
ac: add 64bit support to ac_find_lsb()
v2: use LLVMBuildTrunc()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Tue, 6 Feb 2018 03:38:19 +0000 (14:38 +1100)]
ac: move get_elem_bits() to ac_llvm_build.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Tue, 6 Feb 2018 03:34:55 +0000 (14:34 +1100)]
ac: add 64bit bitCount support
v2: use LLVMBuildTrunc()
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Thu, 8 Feb 2018 13:56:48 +0000 (14:56 +0100)]
ac/nir: clean up handle_fs_outputs_post()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 8 Feb 2018 13:56:47 +0000 (14:56 +0100)]
ac/nir: add radv_load_output() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 8 Feb 2018 13:56:46 +0000 (14:56 +0100)]
ac/shader: scan info about output PS declarations
NIR->LLVM should only be a translation pass, and all scan stuff
should be done before.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 8 Feb 2018 13:56:45 +0000 (14:56 +0100)]
ac/nir: add radv_export_param() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 8 Feb 2018 13:56:44 +0000 (14:56 +0100)]
ac/nir: remove set but unused export_mask
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 8 Feb 2018 13:56:43 +0000 (14:56 +0100)]
ac/nir: remove dead code in handle_vs_outputs_post()
The memcpy can't be reached because the condition is always false.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 8 Feb 2018 13:56:42 +0000 (14:56 +0100)]
ac/nir: remove useless check in si_llvm_init_export_args()
values can't be NULL because we use ac_build_export_null() now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 7 Feb 2018 18:09:13 +0000 (19:09 +0100)]
ac/nir: use ac_build_export_null()
The number of enabled channels should be 0 when exporting null.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 7 Feb 2018 18:09:12 +0000 (19:09 +0100)]
ac: add ac_build_export_null() helper
Imported from RadeonSI.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Scott D Phillips [Thu, 8 Feb 2018 00:55:24 +0000 (16:55 -0800)]
meson: Add build option for tools
Add a build option to control building some of the misc tools we
have. Also set the executables to install; presumably you want
that if you're asking for them to be built.
v2: set 'install:' to the with_tools value, not true (Jordan)
handle 'all' in the comma list (Dylan)
Add freedreno's tools (Dylan)
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Anuj Phogat [Mon, 29 Jan 2018 18:42:17 +0000 (10:42 -0800)]
intel: Add Coffee Lake brand strings
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Brian Paul [Thu, 8 Feb 2018 17:00:21 +0000 (10:00 -0700)]
gallium/util: silence clang warning in blitter code
Silence "warning: comparison of constant 4294967295 with expression
of type 'ubyte'".
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Brian Paul [Thu, 8 Feb 2018 16:54:52 +0000 (09:54 -0700)]
tgsi: s/unsigned/enum tgsi_semantic/ in ureg_DECL_output()
So the function matches the prototype. Found with clang.
v2: fix copy&paste error
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Brian Paul [Thu, 8 Feb 2018 01:29:12 +0000 (18:29 -0700)]
tgsi: use TGSI_INTERPOLATE_x arguments instead of zeros in ureg code
TGSI_INTERPOLATE_CONSTANT and TGSI_INTERPOLATE_LOC_CENTER have the
value zero so there's no change in behavior. It seems funny to
declare these fs input registers with constant interpolation. But
it looks like ureg_DECL_input_layout() is not called anywhere and
ureg_DECL_input() is only called from
util_make_geometry_passthrough_shader().
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Thu, 8 Feb 2018 01:28:34 +0000 (18:28 -0700)]
gallium/util: s/uint/enum tgsi_semantic/ in simple shader code
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Thu, 8 Feb 2018 01:18:39 +0000 (18:18 -0700)]
tgsi: s/unsigned/enum pipe_shader_type/ in ureg code
And add a default switch case to silence a compiler warning.
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Wed, 7 Feb 2018 23:14:11 +0000 (16:14 -0700)]
gallium/util: s/uint/enum tgsi_semantic/ in u_blitter.c
And put static qualifier on const arrays.
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Wed, 7 Feb 2018 23:12:59 +0000 (16:12 -0700)]
st/mesa: s/unsigned/enum tgsi_semantic/ st_cb_drawpixels.c
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Wed, 7 Feb 2018 23:12:35 +0000 (16:12 -0700)]
vbo: add a comment on vbo_draw_transform_feedback()
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Wed, 7 Feb 2018 16:28:16 +0000 (09:28 -0700)]
gallium/util: trivial whitespace/formatting fixes in u_blit.c
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Tue, 6 Feb 2018 22:35:30 +0000 (15:35 -0700)]
vbo: improve comments on vbo_draw_func()
And rename a parameter name.
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Tue, 6 Feb 2018 22:33:37 +0000 (15:33 -0700)]
cso: add a couple sanity check assertions in cso_draw_vbo()
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Wed, 31 Jan 2018 22:01:09 +0000 (15:01 -0700)]
st/mesa: rename some vars related to indirect draw count
'indirect_params' was a bit vague. Use the names that we use in
gallium's pipe_draw_indirect_info.
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Marek Olšák [Thu, 16 Nov 2017 15:01:11 +0000 (16:01 +0100)]
st/mesa: remove out_num_textures from update_textures
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Marek Olšák [Wed, 15 Nov 2017 23:32:22 +0000 (00:32 +0100)]
st/mesa: don't store non-fragment sampler states and views in st_context
those are unused.
st_context: 10120 -> 3704 bytes
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Lionel Landwerlin [Tue, 6 Feb 2018 23:28:24 +0000 (23:28 +0000)]
i965: perf: cleanup detection of kernel support for loadable configs
The initial revision of the patch adding loadable configs was testing
the feature's availability by adding a new config successfully and
then removing it.
A second version tested the availability just by exercising the
removal. But some unused code remained.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Lionel Landwerlin [Tue, 6 Feb 2018 23:23:22 +0000 (23:23 +0000)]
i965: perf: use drmIoctl() instead of ioctl()
ioctl() might be interrupted, use drmIoctl() instead as it'll retry
automatically.
Fixes: 27ee83eaf7e "i965: perf: add support for userspace configurations"
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
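Note on the drmIoctl() change above: the retry behaviour the message
relies on is a loop that repeats the call while it fails with EINTR or
EAGAIN. A minimal sketch of that loop, written against a function pointer
so it can run without a DRM device; ioctl_fn, retry_ioctl() and
flaky_call() are hypothetical names for illustration, not libdrm API:

```c
#include <errno.h>

/* Hypothetical callback type standing in for the real ioctl() call. */
typedef int (*ioctl_fn)(int fd, unsigned long request, void *arg);

/* Repeat the call while it was interrupted (EINTR) or asked to be
 * repeated (EAGAIN); any other result is returned unchanged. */
static int
retry_ioctl(ioctl_fn fn, int fd, unsigned long request, void *arg)
{
   int ret;

   do {
      ret = fn(fd, request, arg);
   } while (ret == -1 && (errno == EINTR || errno == EAGAIN));

   return ret;
}

/* Test double: fails with EINTR until the int countdown behind arg
 * reaches zero, then succeeds. */
static int
flaky_call(int fd, unsigned long request, void *arg)
{
   int *remaining = arg;

   (void)fd;
   (void)request;
   if (*remaining > 0) {
      (*remaining)--;
      errno = EINTR;
      return -1;
   }
   return 0;
}
```

A bare ioctl() caller would see the first -1/EINTR and give up, which is
the failure mode the commit avoids.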
Lionel Landwerlin [Tue, 6 Feb 2018 17:00:58 +0000 (17:00 +0000)]
i965: perf: add debug messages for loaded configs
This helps figure out potential problems when metrics don't show up
on frameretrace for example.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Dave Airlie [Thu, 8 Feb 2018 02:35:46 +0000 (12:35 +1000)]
r600: implement tg4 integer workaround. (v2)
This ports the texture gather integer workaround from radeonsi.
This fixes:
KHR-GL45.texture_gather.plain-gather-uint/int*
v2: add rect support, fix 2d array shadow
Reviewed-by: Roland Scheidegger <sroland@vmware.com> (on irc)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Glenn Kennard [Tue, 6 Feb 2018 20:24:07 +0000 (06:24 +1000)]
r600: clean up initial shader register setup
This is taken from Glenn Kennard's scratch series, but separated
out as a cleanup by me.
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Roland Scheidegger [Sun, 4 Feb 2018 22:54:26 +0000 (23:54 +0100)]
r600: partly fix sampleMaskIn value
The hw gives us coverage for the pixel, not for individual fragment shader
invocations, in case execution isn't per pixel (evergreen, unlike cayman,
actually cannot do "real" minSampleShading, it's either per-pixel or
per-fragment, but it doesn't really make a difference here).
Also, with msaa disabled, the hw still gives us a mask corresponding to
the number of samples, where GL requires this to be 1.
Fix this up by masking the sampleMaskIn bits with the bit corresponding to
the sampleID, if we know this shader is always executed at per-sample
granularity. (In case of a per-sample frequency shader and msaa disabled,
the sampleID will always be 0, so this works just fine there.)
Fixing this for the minSampleShading case will need a shader key (radeonsi
uses the prolog part for this). Evergreen could get away with a single bit;
cayman would need more bits depending on the sample/invocation ratio, or
would have to read the bits from a uniform. The alternative is to always
use a sample mask uniform, which is probably not a good idea, as it would
make the ordinary common msaa case slower for no good reason.
This fixes some parts of piglit arb_sample_shading-samplemask (with fixed
test), in particular those which use a sampleID; the others still fail
as expected.
Reviewed-by: Dave Airlie <airlied@redhat.com>
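Note on the sampleMaskIn fix above: for a shader known to run per sample,
the fix keeps only the coverage bit that belongs to the current sample.
A minimal sketch of that arithmetic in plain C; per_sample_mask_in() and
its parameters are hypothetical names, and the driver itself emits the
equivalent shader instructions rather than calling a function like this:

```c
#include <stdint.h>

/* hw_coverage: per-pixel coverage mask as delivered by the hardware.
 * sample_id:   index of the sample this invocation shades (below 32).
 * Returns the mask restricted to the bit of the current sample. */
static uint32_t
per_sample_mask_in(uint32_t hw_coverage, unsigned sample_id)
{
   return hw_coverage & (UINT32_C(1) << sample_id);
}
```

With msaa disabled the sample id is always 0, so the result can only be
0 or 1 even when the hardware reports a wider mask.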
Roland Scheidegger [Sun, 4 Feb 2018 22:38:28 +0000 (23:38 +0100)]
r600: clean up fragment shader input scan code
For some reason, we were iterating through the code twice (first just for
instructions needing barycentrics, then for instructions and input dcls).
Move things around slightly so this is no longer necessary.
There was also an unneeded enabling of the fixed_pt_position_gpr - this is
only needed if the per-sample interpolation comes from an input, not from
an instruction, since the sample id to sample from comes from a tgsi src in
this case, and isn't sampleID (just move the assert where it belongs).
Otherwise there should be no functional change.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Roland Scheidegger [Sat, 3 Feb 2018 23:32:05 +0000 (00:32 +0100)]
mesa: (trivial) remove unused ignore_sample_qualifier_parameter
This parameter for _mesa_get_min_invocations_per_fragment() was once used
by the intel driver, but it's long gone.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Dave Airlie <airlied@vmware.com>
Roland Scheidegger [Sat, 3 Feb 2018 19:11:35 +0000 (20:11 +0100)]
r600/cm: (trivial) code cleanup for emitting msaa state
No functional change (compile tested only).
Reviewed-by: Dave Airlie <airlied@redhate.com>
Brian Paul [Wed, 7 Feb 2018 05:17:10 +0000 (22:17 -0700)]
tgsi: use tgsi_semantic enum type in ureg code
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Wed, 7 Feb 2018 05:16:41 +0000 (22:16 -0700)]
st/mesa: use tgsi_semantic enum type
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Wed, 7 Feb 2018 05:11:41 +0000 (22:11 -0700)]
tgsi: use TGSI enum types in ureg code
v2: fix enum tgsi_interpolate_mode/loc typo.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Wed, 7 Feb 2018 05:10:59 +0000 (22:10 -0700)]
st/mesa: use TGSI enum types in st_glsl_to_tgsi.cpp
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Wed, 7 Feb 2018 04:55:10 +0000 (21:55 -0700)]
gallium/util: replace uint with tgsi enum types
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Wed, 7 Feb 2018 04:54:38 +0000 (21:54 -0700)]
gallium/util: replace unsigned with tgsi enum types
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Fredrik Höglund [Thu, 25 Jan 2018 17:12:14 +0000 (18:12 +0100)]
radv: implement VK_EXT_external_memory_host
Ported from the radeonsi GL_AMD_pinned_memory implementation.
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Dave Airlie [Wed, 7 Feb 2018 22:12:36 +0000 (08:12 +1000)]
r600: fix rendering regression on r6/7 gpus
Fixes: 2d5b5d267e (r600: work out target mask at framebuffer bind.)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104989
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Grazvydas Ignotas [Sat, 3 Feb 2018 21:54:28 +0000 (23:54 +0200)]
radeonsi: avoid int-to-pointer-cast warnings on 32bit
I hope the actual dropping of MSB is ok, but that's what's already
happened before this change.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Grazvydas Ignotas [Sat, 3 Feb 2018 21:42:28 +0000 (23:42 +0200)]
gallium/hud: update some query functions
It seems these were missed when struct pipe_context * argument was
added to hud_graph::query_new_value.
Fixes: 3132afdf4c "gallium/hud: pass pipe_context explicitly to most functions"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Roland Scheidegger [Wed, 7 Feb 2018 22:47:39 +0000 (23:47 +0100)]
Revert "gallium: build ddebug, noop, rbug, trace as part of auxiliary"
This reverts commit 6f82b8d8d0a986aac28e7bec47fc313fb950475c.
This broke scons build, and reportedly clover with autotools/meson too.
Marek Olšák [Mon, 4 Sep 2017 20:36:34 +0000 (22:36 +0200)]
gallium: build ddebug, noop, rbug, trace as part of auxiliary
Building gallium is faster by 7.5 seconds on a 4core/8thread 3GHz CPU.
(gallium build time is reduced by 15% when building only radeonsi)
Non-recursive makefiles are great!
Roland Scheidegger [Wed, 7 Feb 2018 21:02:54 +0000 (22:02 +0100)]
u_blit: (trivial) fix bogus argument order for set_fragment_shader
Amazingly this still worked sometimes, albeit I'm not even sure why...
This fixes d7bec6f7a6a2a35c80be939db8532011af1e9b67.
Andres Rodriguez [Wed, 7 Feb 2018 19:38:52 +0000 (14:38 -0500)]
mesa: fix incorrect type when allocating arrays
The array members have type 'struct gl_buffer_object *'.
Found by coverity.
Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Roland Scheidegger [Wed, 7 Feb 2018 04:18:17 +0000 (05:18 +0100)]
u_blit,u_simple_shaders: add shader to convert from xrbias format
We need this to handle some oddball dx10 format
(DXGI_FORMAT_R10G10B10_XR_BIAS_A2_UNORM). What you can do with this
format is very limited, hence we don't want to add it as a gallium
format (we could not express the properties of this format as
ordinary format properties either, so like all special formats
it would need specific code for handling it in any case).
While here, also nuke the array for different shaders for different
writemasks, as it was not actually used (always full masks are
passed in for generating shaders).
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Roland Scheidegger [Wed, 7 Feb 2018 04:03:42 +0000 (05:03 +0100)]
u_simple_shaders: fix mask handling in util_make_fragment_tex_shader_writemask
The writemask handling was busted, since writing defaults to output
meant they got overwritten by the tex sampling anyway. Albeit the
affected components were undefined, so maybe with some luck it
still would have worked with some drivers - if not could as well
kill it... (This would have affected u_blitter but not u_blit since
the latter always used xyzw mask.)
Reviewed-by: Brian Paul <brianp@vmware.com>
Bas Nieuwenhuizen [Fri, 2 Feb 2018 15:59:23 +0000 (16:59 +0100)]
autotools: Only build libmesa-st-tests-common.a for tests.
We don't need the library if we don't build tests, and building
it adds a dependency on gtest which adds a dependency on cxxabi.h.
Fixes: 6569b33b6e "mesa/st/tests: unify MockCodeLine* classes"
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
Tapani Pälli [Wed, 17 Jan 2018 09:43:59 +0000 (11:43 +0200)]
i965: add __DRI2_BLOB support and set cache functions
v2: adjust to change that moved cache from ctx to screen
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tapani Pälli [Wed, 7 Feb 2018 06:13:00 +0000 (08:13 +0200)]
disk cache: add callback functionality
v2: add disk_cache_has_key, disk_cache_put_key support
using blob cache (Nicolai, Jordan)
v3: rename set_cb as put_cb to match existing naming (Timothy)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tapani Pälli [Mon, 22 Jan 2018 09:55:06 +0000 (11:55 +0200)]
disk cache: initialize cache path and index only when used
This patch makes disk_cache initialize path and index lazily so
that we can utilize disk_cache without a path using callback
functionality introduced by next patch.
v2: unmap mmap and destroy queue only if index_mmap exists
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
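Note on the lazy initialization above: the idea is that creating the
cache does no path or index work, and the first operation that needs the
path triggers the setup exactly once. A minimal sketch of that pattern;
struct cache, cache_ensure_path() and cache_put() are hypothetical names
and the setup step is reduced to a counter, so this is not the disk_cache
code itself:

```c
#include <stdbool.h>

/* Hypothetical cache object: nothing path-related is set up at creation. */
struct cache {
   bool path_init_done;  /* setup was attempted */
   bool path_ok;         /* setup succeeded */
   int init_calls;       /* how often the expensive setup actually ran */
};

/* Run the path/index setup on first use only and report its outcome. */
static bool
cache_ensure_path(struct cache *c)
{
   if (!c->path_init_done) {
      c->path_init_done = true;
      c->init_calls++;
      c->path_ok = true;  /* stands in for creating the directory and index */
   }
   return c->path_ok;
}

/* Every operation that needs the path goes through the lazy setup. */
static bool
cache_put(struct cache *c)
{
   return cache_ensure_path(c);
}
```

A cache that is only ever used through callbacks would never reach
cache_ensure_path() in this sketch and so would never touch the path.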
Tapani Pälli [Thu, 1 Feb 2018 07:08:52 +0000 (09:08 +0200)]
glsl/tests: changes to test_disk_cache_create test
The next patch will allow a disk_cache instance to be created without a
path set for it, so modify some test cases that assume disk_cache
creation fails with an invalid path. Creation should succeed but a
simple put/get test should fail.
v2: leave tests as is but check that both cache struct exists
and try simple put/get that should fail with invalid path set
(Emil)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tapani Pälli [Tue, 30 Jan 2018 09:42:55 +0000 (11:42 +0200)]
glsl/tests: move utility functions in cache_test
Patch moves functions higher so that we can utilize them from
test_disk_cache_create which is modified by next patch.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tapani Pälli [Thu, 28 Dec 2017 08:51:11 +0000 (10:51 +0200)]
egl: add support for EGL_ANDROID_blob_cache
v2: cleanup, move callbacks to _egl_display struct (Emil Velikov)
adapt to earlier ctx->screen changes
v3: remove useless checking, add _eglSetFuncName (Emil Velikov)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v2)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tapani Pälli [Thu, 28 Dec 2017 07:27:24 +0000 (09:27 +0200)]
dri: add interface for EGL_ANDROID_blob_cache extension
v2: move from __DRIcontext to __DRIscreen (Emil Velikov)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Samuel Pitoiset [Tue, 6 Feb 2018 21:06:11 +0000 (22:06 +0100)]
ac/nir: use new pknorm_i16/u16 and pk_i16/u16 LLVM intrinsics
Ported from RadeonSI.
Only one F1 2017 shader is affected, code size decreased
from 532 to 488 on both Polaris10 and Vega10.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 1 Feb 2018 10:35:06 +0000 (11:35 +0100)]
ac/nir: avoid loading unused VS input components
Polaris10:
Totals from affected shaders:
SGPRS: 122840 -> 120984 (-1.51 %)
VGPRS: 78812 -> 78440 (-0.47 %)
Spilled SGPRs: 177 -> 129 (-27.12 %)
Code Size: 2950028 -> 2941276 (-0.30 %) bytes
Max Waves: 17899 -> 17976 (0.43 %)
Vega10:
Totals from affected shaders:
SGPRS: 117144 -> 115776 (-1.17 %)
VGPRS: 77580 -> 77532 (-0.06 %)
Spilled SGPRs: 0 -> 152 (0.00 %)
Code Size: 3352656 -> 3347860 (-0.14 %) bytes
Max Waves: 19756 -> 19866 (0.56 %)
This increases SGPRs spilling a bit with Talos, but I have
some other ideas that might reduce it.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Thu, 1 Feb 2018 10:32:32 +0000 (11:32 +0100)]
ac/shader: scan vertex inputs usage mask
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Iago Toral Quiroga [Thu, 4 Jan 2018 02:55:13 +0000 (03:55 +0100)]
i965: allocate a SGVS element when VertexID or InstanceID are read
Although on gen8+ platforms we can in theory use 3DSTATE_VF_SGVS
to put these beyond the last vertex element it seems that we still
need to allocate the SGVS element, otherwise we have observed cases
where we end up reading garbage. Specifically, the CTS test mentioned
below was flaky with a fail rate of ~1% on some gen9+ platforms caused
by reading garbage for the gl_InstanceID value. The flakiness goes
away as soon as we start allocating the SGVS element.
v2:
- Do this for gen8+, not just gen9+, and pull the boolean
outside the #if block (Jason)
Fixes flaky test:
KHR-GL45.vertex_attrib_64bit.limits_test
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104335
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Dylan Baker [Mon, 20 Nov 2017 22:55:13 +0000 (14:55 -0800)]
glapi: fix check_table test for non-shared glapi with meson
v2: - Add glapitable_h generated source to requirements
Fixes: 3218056e0eb3 ("meson: Build i965 and dri stack")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com> (v1)
Dylan Baker [Mon, 20 Nov 2017 22:54:30 +0000 (14:54 -0800)]
glapi: Don't search through subdirs from glapitable.h
Because meson won't put it in that folder.
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Dylan Baker [Mon, 20 Nov 2017 22:53:09 +0000 (14:53 -0800)]
state_tracker: Don't build st-renumerate-test without shared glapi
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Dylan Baker [Mon, 20 Nov 2017 22:39:27 +0000 (14:39 -0800)]
glapi: remove APPLE extensions from test
Fixes: 7009955281260fbb ("mesa: Remove GL_APPLE_vertex_array_object stubs")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
Dylan Baker [Mon, 20 Nov 2017 18:05:25 +0000 (10:05 -0800)]
glapi/check_table: Remove 'extern "C"' block
Using 'extern "C"' around includes is always incorrect, as the header may
contain C++ symbols (as it does in this case), which means it cannot use
C linkage. In this case the header has a template in it, which obviously
cannot be linked with C linkage rules.
Fixes: a29ad2b421b75a1727b ("mesa/tests: Add tests for the generated dispatch table")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Dylan Baker [Wed, 8 Nov 2017 00:00:34 +0000 (16:00 -0800)]
meson: fix test source name for static glapi
Fixes: 43a6e84927e3 ("meson: build mesa test.")
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Dylan Baker [Tue, 7 Nov 2017 17:40:06 +0000 (09:40 -0800)]
glapi: don't walk backwards for includes
Instead just set the proper -I flags and include it from a more standard
path. In this case we'll add -Isrc/mesa (which is common), and #include
main/foo.h.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Brian Paul [Mon, 5 Feb 2018 16:33:58 +0000 (09:33 -0700)]
mesa: rename gl_vertex_array_object::_VertexAttrib -> _VertexArray
Since the type is gl_vertex_array. Update comment to explain that
these arrays are only used by the VBO module.
Also rename some local variables in _mesa_update_vao_derived_arrays().
Reviewed-by: Mathias Fröhlich <mathias.froehlich@web.de>
Brian Paul [Tue, 6 Feb 2018 22:12:58 +0000 (15:12 -0700)]
mesa: minor whitespace fixes, line wrapping in texcompress.c
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Brian Paul [Tue, 6 Feb 2018 16:55:13 +0000 (09:55 -0700)]
mesa: simplify _mesa_get_compressed_formats()
Instead of testing for formats==NULL everywhere, just point formats at
a dummy array which will be discarded.
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
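Note on the _mesa_get_compressed_formats() simplification above: when the
caller only wants the count, the output pointer is redirected to a
throwaway array so the body can write unconditionally. A minimal sketch
of the pattern; get_formats(), MAX_FORMATS and the format ids 100 and 200
are hypothetical and not taken from Mesa:

```c
#include <stddef.h>

enum { MAX_FORMATS = 8 };

/* Stores the supported format ids in formats (may be NULL when only the
 * count is wanted) and returns how many there are. */
static unsigned
get_formats(int *formats, int have_extra_format)
{
   int discard[MAX_FORMATS];
   unsigned n = 0;

   /* One check up front instead of a NULL test before every store. */
   if (formats == NULL)
      formats = discard;

   formats[n++] = 100;
   if (have_extra_format)
      formats[n++] = 200;

   return n;
}
```

The dummy array has to be at least as large as the longest list the
function can produce.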
Vlad Golovkin [Tue, 6 Feb 2018 13:48:00 +0000 (06:48 -0700)]
util: remove redundant check for the __clang__ macro
Clang defines the __GNUC__ macro, so one doesn't need to check the
__clang__ macro in this particular case.
v2: added comment as per Brian Paul's suggestion
Reviewed-by: Brian Paul <brianp@vmware.com>
Brian Paul [Fri, 2 Feb 2018 16:21:44 +0000 (09:21 -0700)]
st/mesa: use st_access_flags_to_transfer_flags() helper in more places
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Brian Paul [Fri, 2 Feb 2018 15:54:19 +0000 (08:54 -0700)]
st/mesa: refactor st_bufferobj_map_range()
Use a new helper function, st_access_flags_to_transfer_flags(), to
convert the GL_MAP_x flags to PIPE_TRANSFER_x flags.
We'll be able to use this function in a couple other places.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
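Note on the st_access_flags_to_transfer_flags() helper above: the helper
translates one set of bit flags into another bit by bit. A minimal sketch
of that kind of conversion; the MAP_* and XFER_* constants and
access_to_transfer() are hypothetical stand-ins with made-up values, not
the real GL_MAP_x or PIPE_TRANSFER_x definitions:

```c
/* Hypothetical API-side access flags. */
enum {
   MAP_READ = 0x1,
   MAP_WRITE = 0x2,
   MAP_INVALIDATE_RANGE = 0x4
};

/* Hypothetical driver-side transfer flags with unrelated bit values. */
enum {
   XFER_READ = 0x10,
   XFER_WRITE = 0x20,
   XFER_DISCARD_RANGE = 0x100
};

/* Translate each access bit into its transfer counterpart. */
static unsigned
access_to_transfer(unsigned access)
{
   unsigned flags = 0;

   if (access & MAP_READ)
      flags |= XFER_READ;
   if (access & MAP_WRITE)
      flags |= XFER_WRITE;
   if (access & MAP_INVALIDATE_RANGE)
      flags |= XFER_DISCARD_RANGE;

   return flags;
}
```

Keeping the mapping in one function is what lets other callers reuse it
instead of repeating the if chain.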
Brian Paul [Fri, 2 Feb 2018 15:38:50 +0000 (08:38 -0700)]
st/mesa: refactor bufferobj_data()
Split out some of the code into three new helper functions:
buffer_target_to_bind_flags(), storage_flags_to_buffer_flags(),
buffer_usage() to make the code more manageable.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Mon, 29 Jan 2018 16:19:18 +0000 (17:19 +0100)]
radv: run nir_opt_shrink_load
LLVM can't shrink loads.
Polaris10:
Totals from affected shaders:
SGPRS: 62528 -> 59955 (-4.11 %)
VGPRS: 44708 -> 44616 (-0.21 %)
Spilled SGPRs: 16 -> 8 (-50.00 %)
Code Size: 1355504 -> 1355172 (-0.02 %) bytes
Max Waves: 11710 -> 11670 (-0.34 %)
Vega10:
Totals from affected shaders:
SGPRS: 51448 -> 50371 (-2.09 %)
VGPRS: 39140 -> 39048 (-0.24 %)
Spilled SGPRs: 16 -> 16 (0.00 %)
Code Size: 1307188 -> 1304296 (-0.22 %) bytes
Max Waves: 11312 -> 11292 (-0.18 %)
This reduces SGPRs spilling in MadMax, and it also reduces
number of SGPRs in DOW3 and F12017. The number of waves slightly
decreases in F1 but I don't see any performance changes after
benchmarking it. Talos and Serious Sam are not affected because
they don't use any push constants.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Mon, 29 Jan 2018 16:19:00 +0000 (17:19 +0100)]
nir: add nir_opt_shrink_load pass
This is a very simple pass that just shrinks load_push_constant
intrinsics when some components are unused. For now, it can just
shrink vec4 to vec3, vec3 to vec2 and so on.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
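Note on nir_opt_shrink_load above: shrinking vec4 to vec3, vec3 to vec2
and so on amounts to dropping trailing components that nothing reads.
A minimal sketch of that computation on a plain bitmask;
shrunk_num_components() and its parameters are hypothetical names, the
function does not use the NIR API, and keeping at least one component is
an assumption of this sketch:

```c
/* num_components: current width of the load (1..4).
 * read_mask:      bit i is set when component i is read by some user.
 * Returns the width after dropping unused trailing components. */
static unsigned
shrunk_num_components(unsigned num_components, unsigned read_mask)
{
   while (num_components > 1 &&
          !(read_mask & (1u << (num_components - 1))))
      num_components--;

   return num_components;
}
```

Only trailing components can be dropped this way; a load whose last
component is read keeps its full width even if earlier ones are unused.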
Timothy Arceri [Thu, 1 Feb 2018 23:09:47 +0000 (10:09 +1100)]
radeonsi/nir: add nir support for compiling compute shaders
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Fri, 2 Feb 2018 03:33:06 +0000 (14:33 +1100)]
ac/radeonsi: add num_work_groups to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Fri, 2 Feb 2018 02:55:25 +0000 (13:55 +1100)]
ac: implement nir_intrinsic_shader_clock
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Fri, 2 Feb 2018 02:54:48 +0000 (13:54 +1100)]
ac/radeonsi: create ac_build_shader_clock() helper
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Fri, 2 Feb 2018 02:14:41 +0000 (13:14 +1100)]
ac/radeonsi: add load_local_group_size() to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Fri, 2 Feb 2018 02:06:02 +0000 (13:06 +1100)]
radeonsi: add get_block_size() helper
This will be reused by the nir backend in a later patch.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Thu, 1 Feb 2018 23:24:16 +0000 (10:24 +1100)]
ac: don't call emit_outputs() for compute
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Thu, 1 Feb 2018 23:23:46 +0000 (10:23 +1100)]
ac/radeonsi: add local_invocation_ids to the abi
Reviewed-by: Marek Olšák <marek.olsak@amd.com>