Aaron Watry [Thu, 10 Aug 2017 03:02:30 +0000 (22:02 -0500)]
clover: Allow overriding platform/device version numbers
Useful for testing API, builtin library, and device completeness of
not-yet-supported versions.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
(v3) Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Cc: Jan Vesely <jan.vesely@rutgers.edu>
v4: Remove redundant std::string wrapper around debug_get_option calls
v3: mark CL version overrides as static and const
v2: Make version_string in platform const in case
Aaron Watry [Sun, 6 Aug 2017 01:41:40 +0000 (20:41 -0500)]
clover/llvm: Pass device down to compile
We'll need to be able to detect device version to define the appropriate
__OPENCL_VERSION__ header.
v2: Rebase after removing the previous patch (Pierre)
- Removed "clover: Add device_clc_version to llvm::create_compiler_instance"
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Aaron Watry [Sun, 6 Aug 2017 01:18:48 +0000 (20:18 -0500)]
clover: Pass device to llvm::create_compiler_instance
We'll be using dev.device_clc_version to select the default language version
soon along with the existing ir_target field.
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
v4: Pass the device down instead of device_clc_version as a separate field
v3: Revise to acknowledge that we now have the device in compile/link_program
instead of the string values.
v2: (Pierre) Move changes to create_compiler_instance invocation to correct
patch to prevent temporary build breakage.
(Jan) Use device_clc_version instead of device_version for compile/link
Aaron Watry [Sat, 10 Feb 2018 20:03:13 +0000 (14:03 -0600)]
clover/llvm: Use device in llvm compilation instead of copying fields
Copying the individual fields from the device when compiling/linking
will lead to an unnecessarily large number of fields getting passed
around.
v3: Rebase on current master
v2: Use device in function args before making additional changes in
following patches
Signed-off-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Pierre Moreau <pierre.morrow@free.fr>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Timothy Arceri [Thu, 1 Mar 2018 04:21:52 +0000 (15:21 +1100)]
radeonsi/nir: fix handling of doubles for gs inputs
Fixes piglit test:
tests/spec/arb_gpu_shader_fp64/execution/explicit-location-gs-fs-vs.shader_test
Reviewed-by: Dave Airlie <airlied@redhat.com>
Timothy Arceri [Thu, 1 Mar 2018 04:37:25 +0000 (15:37 +1100)]
ac: pass the unmodified number of components to load gs inputs
Currently both users of this would overflow an array when the
input was a dual slot double as they expected the number of
components to be a max of 4.
Since we pass the type we can just let the functions handle
doubles in a way they choose.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Timothy Arceri [Thu, 1 Mar 2018 04:17:34 +0000 (15:17 +1100)]
radeonsi: move si_nir_load_input_gs() to si_shader.c
All the tess shader and tgsi equivalents are here and it allows
use to use llvm_type_is_64bit() in the following patch without
exposing it externally.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Boris Brezillon [Thu, 11 Jan 2018 09:22:04 +0000 (10:22 +0100)]
broadcom/vc4: Add support for HW perfmon
The V3D engine provides several perf counters.
Implement ->get_driver_query_[group_]info() so that these counters are
exposed through the GL_AMD_performance_monitor extension.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Boris Brezillon [Thu, 11 Jan 2018 09:22:03 +0000 (10:22 +0100)]
drm-uapi: Update vc4 header with perfmon related definitions
v2: Update to the final version with the documentation.
Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Signed-off-by: Eric Anholt <eric@anholt.net>
Roland Scheidegger [Mon, 5 Mar 2018 19:12:32 +0000 (20:12 +0100)]
r600: fix color export mask
The r600 code (not the eg one) forgot to copy the ps_color_export_mask
in commit
5b14e06d8b42e2b08ebc52b6c314ef8647d87a1f when updating the
pixel state, leading to misrenderings (probably with MRT).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105262
Tested-by: LoneVVolf <lonewolf@xs4all.nl>
Tested-by: Pavel Vinogradov <public@sourcemage.org>
Andres Gomez [Mon, 5 Mar 2018 15:25:36 +0000 (17:25 +0200)]
travis: keep meson version below 0.45.0
Recently Meson upgraded to 0.45.0 and it needs python 3.5+, which is
not available in Trusty.
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Kenneth Graunke [Wed, 14 Feb 2018 02:13:51 +0000 (18:13 -0800)]
intel: Drop SURFACE_FORMAT enum from genxml.
We want people to be using ISL_FORMAT_*, rather than the genxml format
enumerations. This patch drops 10 separate copies, and drops a bunch
of ugly casting.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
[jordan.l.justen@intel.com: Minor changes for rebase]
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jordan Justen [Tue, 27 Feb 2018 04:31:22 +0000 (20:31 -0800)]
intel/common: Use isl for decoder surface formats
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jordan Justen [Tue, 27 Feb 2018 01:57:19 +0000 (17:57 -0800)]
intel/isl: Add isl_format_is_valid
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jordan Justen [Mon, 26 Feb 2018 23:39:59 +0000 (15:39 -0800)]
intel: Split gen_device_info out into libintel_dev
Split out the device info so isl doesn't depend on intel/common. Now
it will depend on the new intel/dev device info lib.
This will allow the decoder in intel/common to use isl, allowing us to
apply Ken's patch that removes the genxml duplication of surface
formats.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Gert Wollny [Wed, 28 Feb 2018 13:50:21 +0000 (14:50 +0100)]
gallium/aux/hud: Avoid possible buffer overflow
Limit the length of acceptable cpu names for use in hud_get_num_cpufreq
in order to avoid a buffer overflow later in add_object when this name
is copied into cpufreq_info::name.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105274
Signed-off-by: Gert Wollny <gw.fossdev@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Eric Engestrom [Wed, 28 Feb 2018 16:08:54 +0000 (16:08 +0000)]
gbm: give a name to rgba fields
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Andres Gomez [Fri, 2 Mar 2018 11:28:28 +0000 (13:28 +0200)]
egl: remove duplicated initialization
Found by inspection.
The line removed is a duplicate of the line literally just above the
the 3 lines context usually printed in a commit log.
v2: enhance the commit log (Emil).
Cc: Ian Romanick <ian.d.romanick@intel.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Rob Clark [Fri, 2 Mar 2018 15:21:55 +0000 (10:21 -0500)]
freedreno/ir3: start dealing with half-precision
Some instructions, assume src and/or dst is half-precision based on a
type field (ie. f32/s32/u32 are full precision but others are half
precision). So add some code to sanity check the src/dst registers to
catch mixups.
Also propagate half-precision flag for SSA sources. The instruction
consuming a SSA value needs to be of the same type as the one producing
it.
This is probably not complete half-precision support, but a useful first
step. We do still need to add support for nir alu instructions for
converting between half/full precision.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Wed, 28 Feb 2018 22:33:29 +0000 (17:33 -0500)]
freedreno/ir3: fix fixing-up register footprint
It isn't just vertex shaders that need to fixup reg footprint for inputs
populated before shader starts.
This problem showed up with compute shaders. If you have (for example)
a localregid sysval, but only the .x component is used, the hw still
writes the .yz components, which could overflow into other threads
causing corruption. Showed up in cl cts 'basic/test_basic intmath_int'.
But in theory the same problem could crop up elsewhere.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Tue, 27 Feb 2018 17:59:57 +0000 (12:59 -0500)]
freedreno: surfaces can be PIPE_BUFFER
At least for clover.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Mon, 26 Feb 2018 18:38:22 +0000 (13:38 -0500)]
freedreno/a5xx: handle compute resources
Not *entirely* sure why this is a different BIND bit, but it is.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Mon, 26 Feb 2018 18:04:21 +0000 (13:04 -0500)]
freedreno/ir3: ignore return jump
I think this should also always only occur at the end of a BB (by
definition), and the BB successor should be the end block.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Mon, 26 Feb 2018 16:29:05 +0000 (11:29 -0500)]
freedreno: add some more compute caps
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Mon, 26 Feb 2018 16:24:13 +0000 (11:24 -0500)]
freedreno/a5xx: don't expose 64b pointers yet
Temporary hack, but since we can't do 64b math yet in ir3, pretend that
we don't support 64b pointers.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Mon, 26 Feb 2018 16:22:33 +0000 (11:22 -0500)]
freedreno: steal handy macro for compute caps from nouveau
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sun, 25 Feb 2018 21:11:06 +0000 (16:11 -0500)]
freedreno: add global_bindings state
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sun, 25 Feb 2018 20:05:01 +0000 (15:05 -0500)]
freedreno/ir3: small cleanup
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sun, 25 Feb 2018 20:01:07 +0000 (15:01 -0500)]
freedreno: add pctx->memory_barrier()
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Sat, 24 Feb 2018 16:33:09 +0000 (11:33 -0500)]
freedreno/ir3: cmdline compiler updates for spv shaders
Signed-off-by: Rob Clark <robdclark@gmail.com>
Samuel Pitoiset [Fri, 2 Mar 2018 14:01:32 +0000 (15:01 +0100)]
ac: add ac_build_fsign()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Samuel Pitoiset [Fri, 2 Mar 2018 14:01:31 +0000 (15:01 +0100)]
ac: add ac_build_isign()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Samuel Pitoiset [Fri, 2 Mar 2018 14:01:30 +0000 (15:01 +0100)]
ac: add ac_build_fract()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
gurchetansingh@chromium.org [Fri, 23 Feb 2018 02:02:18 +0000 (18:02 -0800)]
virgl: add offset alignment values to to v2 caps struct
glBindBufferRange(..) in vrend_draw_bind_ubo is failing with
more than one uniform block. This is due to improper alignment
of the start of the second block. Let's query the proper
alignment from the driver and pass it back to Mesa.
Let's query for the texture alignment too, even though the Virgl
renderer doesn't call glTexBufferRange yet.
The default values are the widest workable range possible (for example,
GL_UNIFORM_BUFFER_OFFSET_ALIGNMENT on Nvidia is 256).
Fixes:
dEQP-GLES3.functional.ubo.* on Nvidia
Example test:
dEQP-GLES3.functional.ubo.multi_basic_types.single_buffer.shared_vertex
Note: This is based on "virgl: reduce some default capset limits.",
which hasn't landed in Mesa yet but should relatively soon.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 21 Feb 2018 01:48:05 +0000 (11:48 +1000)]
virgl: reduce some default capset limits.
Since v2 might take a while to rollout, we should reduce
these inside some gathered minimums and then v2 can increase
them using host values.
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 15 Feb 2018 04:20:37 +0000 (14:20 +1000)]
virgl: handle getting new capsets.
This checks the kernel api is new enough and asks for the
larger caps size since the kernel won't mess it up now.
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Timothy Arceri [Mon, 5 Mar 2018 01:06:01 +0000 (12:06 +1100)]
radeonsi/nir: call ac_lower_indirect_derefs()
Fixes piglit tests:
tests/spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec3-index-rd.shader_test
tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Timothy Arceri [Mon, 5 Mar 2018 01:04:47 +0000 (12:04 +1100)]
radeonsi: add chip class to compiler_ctx_state
This will be used in the following patch.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Timothy Arceri [Mon, 5 Mar 2018 00:13:11 +0000 (11:13 +1100)]
ac/radv: move lower_indirect_derefs() to ac_nir_to_llvm.c
Until llvm handles indirects better we will need to use these
workarounds in the radeonsi backend also.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bas Nieuwenhuizen [Sun, 4 Mar 2018 13:47:20 +0000 (14:47 +0100)]
radv: Fix copying from 3D images starting at non-zero depth.
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Vinson Lee [Sat, 24 Feb 2018 23:49:32 +0000 (15:49 -0800)]
swr/rast: Fix macOS macro.
Fixes: a25093de7188 ("swr/rast: Implement JIT shader caching to disk")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
Mathias Fröhlich [Wed, 28 Feb 2018 07:31:44 +0000 (08:31 +0100)]
vbo: Try to reuse the same VAO more often for successive dlists.
The change tries to catch more opportunities to reuse the same set
of VAO's when building up display lists. Instead of checking the
offset with respect to the beginning of the vertex buffer object
the change tries to apply this same optimization with respect to the
previous display list node.
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Ian Romanick [Thu, 22 Feb 2018 03:23:44 +0000 (19:23 -0800)]
mesa: Silence unused parameter warnings from TEXSTORE_PARAMS
Reduces my build from 1717 warnings to 1547 warnings by silencing 170
instances of things like
In file included from ../../SOURCE/master/src/mesa/main/texcompress_bptc.h:30:0,
from ../../SOURCE/master/src/mesa/main/texcompress_bptc.c:31:
../../SOURCE/master/src/mesa/main/texcompress_bptc.c: In function ‘_mesa_texstore_bptc_rgba_unorm’:
../../SOURCE/master/src/mesa/main/texstore.h:60:14: warning: unused parameter ‘dstFormat’ [-Wunused-parameter]
mesa_format dstFormat, \
^
../../SOURCE/master/src/mesa/main/texcompress_bptc.c:1276:32: note: in expansion of macro ‘TEXSTORE_PARAMS’
_mesa_texstore_bptc_rgba_unorm(TEXSTORE_PARAMS)
^~~~~~~~~~~~~~~
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Ian Romanick [Thu, 22 Feb 2018 00:16:53 +0000 (16:16 -0800)]
i965: Silence unused parameter warnings in genX_state_upload
Reduces my build from 1772 warnings to 1717 warnings by silencing 55
instances of things like
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_emit_vertex_buffer_state’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:313:41: warning: unused parameter ‘end_offset’ [-Wunused-parameter]
unsigned end_offset,
^~~~~~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_emit_sampler_state_pointers_xs’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4689:58: warning: unused parameter ‘brw’ [-Wunused-parameter]
genX(emit_sampler_state_pointers_xs)(struct brw_context *brw,
^~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4690:62: warning: unused parameter ‘stage_state’ [-Wunused-parameter]
struct brw_stage_state *stage_state)
^~~~~~~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_upload_default_color’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4730:40: warning: unused parameter ‘format’ [-Wunused-parameter]
mesa_format format, GLenum base_format,
^~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘translate_wrap_mode’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4906:41: warning: unused parameter ‘brw’ [-Wunused-parameter]
translate_wrap_mode(struct brw_context *brw, GLenum wrap, bool using_nearest)
^~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c: In function ‘gen4_update_sampler_state’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_state_upload.c:4972:37: warning: unused parameter ‘batch_offset_for_sampler_state’ [-Wunused-parameter]
uint32_t batch_offset_for_sampler_state)
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Ian Romanick [Wed, 21 Feb 2018 02:42:02 +0000 (18:42 -0800)]
isl: Silence unused parameter warnings in __gen_combine_address implementations
Reduces my build from 1808 warnings to 1772 warnings by silencing 36
instances of things like
../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c: In function ‘__gen_combine_address’:
../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:30:29: warning: unused parameter ‘data’ [-Wunused-parameter]
__gen_combine_address(void *data, void *loc, uint64_t addr, uint32_t delta)
^~~~
../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:30:41: warning: unused parameter ‘loc’ [-Wunused-parameter]
__gen_combine_address(void *data, void *loc, uint64_t addr, uint32_t delta)
^~~
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Ian Romanick [Sat, 17 Feb 2018 03:09:13 +0000 (19:09 -0800)]
genxml: Silence unused parameter warnings in generated pack code
Reduces my build from 1960 warnings to 1808 warnings by silencing 152
instances of things like
In file included from ../../SOURCE/master/src/intel/genxml/genX_pack.h:32:0,
from ../../SOURCE/master/src/intel/isl/isl_emit_depth_stencil.c:36:
src/intel/genxml/gen4_pack.h: In function ‘__gen_uint’:
src/intel/genxml/gen4_pack.h:58:49: warning: unused parameter ‘end’ [-Wunused-parameter]
__gen_uint(uint64_t v, uint32_t start, uint32_t end)
^~~
src/intel/genxml/gen4_pack.h: In function ‘__gen_offset’:
src/intel/genxml/gen4_pack.h:94:35: warning: unused parameter ‘start’ [-Wunused-parameter]
__gen_offset(uint64_t v, uint32_t start, uint32_t end)
^~~~~
src/intel/genxml/gen4_pack.h:94:51: warning: unused parameter ‘end’ [-Wunused-parameter]
__gen_offset(uint64_t v, uint32_t start, uint32_t end)
^~~
src/intel/genxml/gen4_pack.h: In function ‘__gen_ufixed’:
src/intel/genxml/gen4_pack.h:133:48: warning: unused parameter ‘end’ [-Wunused-parameter]
__gen_ufixed(float v, uint32_t start, uint32_t end, uint32_t fract_bits)
^~~
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Ian Romanick [Sat, 17 Feb 2018 03:00:21 +0000 (19:00 -0800)]
i965: Silence unused parameter warnings in blorp
Reduces my build from 2023 warnings to 1960 warnings by silencing 63
instances of things like
In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:33:0:
../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_emit_cc_viewport’:
../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h:500:51: warning: unused parameter ‘params’ [-Wunused-parameter]
const struct blorp_params *params)
^~~~~~
../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h: In function ‘blorp_emit_sampler_state’:
../../SOURCE/master/src/intel/blorp/blorp_genX_exec.h:524:53: warning: unused parameter ‘params’ [-Wunused-parameter]
const struct blorp_params *params)
^~~~~~
In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:36:0:
../../SOURCE/master/src/mesa/drivers/dri/i965/gen4_blorp_exec.h: In function ‘blorp_emit_vs_state’:
../../SOURCE/master/src/mesa/drivers/dri/i965/gen4_blorp_exec.h:50:48: warning: unused parameter ‘params’ [-Wunused-parameter]
const struct blorp_params *params)
^~~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c: In function ‘blorp_flush_range’:
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:39: warning: unused parameter ‘batch’ [-Wunused-parameter]
blorp_flush_range(struct blorp_batch *batch, void *start, size_t size)
^~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:52: warning: unused parameter ‘start’ [-Wunused-parameter]
blorp_flush_range(struct blorp_batch *batch, void *start, size_t size)
^~~~~
../../SOURCE/master/src/mesa/drivers/dri/i965/genX_blorp_exec.c:197:66: warning: unused parameter ‘size’ [-Wunused-parameter]
blorp_flush_range(struct blorp_batch *batch, void *start, size_t size)
^~~~
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Ian Romanick [Sat, 17 Feb 2018 01:48:57 +0000 (17:48 -0800)]
nir: Silence unused parameter warnings in generated nir_constant_expressions code
Reduces my build from 2075 warnings to 2023 warnings by silencing 52
instances of things like
src/compiler/nir/nir_constant_expressions.c: In function ‘evaluate_bfi’:
src/compiler/nir/nir_constant_expressions.c:1812:61: warning: unused parameter ‘bit_size’ [-Wunused-parameter]
evaluate_bfi(MAYBE_UNUSED unsigned num_components, unsigned bit_size,
^~~~~~~~
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Ian Romanick [Mon, 12 Feb 2018 19:26:39 +0000 (11:26 -0800)]
i965: Silence unused parameter warnings in generated OA code
Reduces my build from 6301 warnings to 2075 warnings by silencing 4226
instances of things like
src/mesa/drivers/dri/i965/i965@sta/brw_oa_hsw.c: In function ‘hsw__render_basic__gpu_core_clocks__read’:
src/mesa/drivers/dri/i965/i965@sta/brw_oa_hsw.c:41:62: warning: unused parameter ‘brw’ [-Wunused-parameter]
hsw__render_basic__gpu_core_clocks__read(struct brw_context *brw,
^~~
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Ian Romanick [Mon, 12 Feb 2018 19:16:55 +0000 (11:16 -0800)]
i965: Silence warnings about mixing enum and non-enum in conditional
Reduces my build from 6451 warnings to 6301 warnings by silencing 150
instances of
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_reg_type brw_inst_src1_type(const gen_device_info*, const brw_inst*)’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:802:55: warning: enumeral and non-enumeral type in conditional expression [-Wextra]
unsigned file = __builtin_strcmp("dst", #reg) == 0 ? \
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~
BRW_GENERAL_REGISTER_FILE : \
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
brw_inst_##reg##_reg_file(devinfo, inst); \
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h:811:1: note: in expansion of macro ‘REG_TYPE’
REG_TYPE(src1)
^~~~~~~~
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Ian Romanick [Mon, 12 Feb 2018 18:57:06 +0000 (10:57 -0800)]
intel/compiler: Silence unused parameter warnings in release builds
Reduces my build from 7005 warnings to 6451 warnings by silencing 554
instances of
In file included from ../../SOURCE/master/src/intel/compiler/brw_disasm.c:28:0:
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_3src_a1_src0_imm’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:346:57: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
brw_inst_3src_a1_src0_imm(const struct gen_device_info *devinfo,
^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_3src_a1_src2_imm’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:354:57: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
brw_inst_3src_a1_src2_imm(const struct gen_device_info *devinfo,
^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_set_3src_a1_src0_imm’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:362:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
brw_inst_set_3src_a1_src0_imm(const struct gen_device_info *devinfo,
^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_set_3src_a1_src2_imm’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:370:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
brw_inst_set_3src_a1_src2_imm(const struct gen_device_info *devinfo,
^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_inst.h: In function ‘brw_inst_imm_uq’:
../../SOURCE/master/src/intel/compiler/brw_inst.h:703:47: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
brw_inst_imm_uq(const struct gen_device_info *devinfo, const brw_inst *insn)
^~~~~~~
In file included from ../../SOURCE/master/src/intel/compiler/brw_shader.h:29:0,
from ../../SOURCE/master/src/intel/compiler/brw_disasm.c:29:
../../SOURCE/master/src/intel/compiler/brw_compiler.h: In function ‘brw_stage_has_packed_dispatch’:
../../SOURCE/master/src/intel/compiler/brw_compiler.h:1277:61: warning: unused parameter ‘devinfo’ [-Wunused-parameter]
brw_stage_has_packed_dispatch(const struct gen_device_info *devinfo,
^~~~~~~
../../SOURCE/master/src/intel/compiler/brw_disasm.c: In function ‘src_ia1’:
../../SOURCE/master/src/intel/compiler/brw_disasm.c:849:18: warning: unused parameter ‘_reg_file’ [-Wunused-parameter]
unsigned _reg_file,
^~~~~~~~~
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Ian Romanick [Mon, 12 Feb 2018 18:52:49 +0000 (10:52 -0800)]
i965: Silence unused parameter warnings
Reduces my build from 7119 warnings to 7005 warnings by silencing 114
instances of
In file included from ../../SOURCE/master/src/mesa/drivers/dri/i965/brw_context.h:46:0,
from ../../SOURCE/master/src/mesa/drivers/dri/i965/intel_pixel_read.c:38:
../../SOURCE/master/src/mesa/drivers/dri/i965/brw_bufmgr.h: In function ‘brw_bo_unmap’:
../../SOURCE/master/src/mesa/drivers/dri/i965/brw_bufmgr.h:258:47: warning: unused parameter ‘bo’ [-Wunused-parameter]
static inline int brw_bo_unmap(struct brw_bo *bo) { return 0; }
^~
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Kenneth Graunke [Tue, 27 Feb 2018 00:34:55 +0000 (16:34 -0800)]
intel: Drop program size pointer from vec4/fs assembly getters.
These days, we're just passing a pointer to a prog_data field, which
we already have access to. We can just use it directly.
(In the past, it was a pointer to a separate value.)
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Kenneth Graunke [Tue, 27 Feb 2018 07:41:33 +0000 (23:41 -0800)]
i965: Mark upload buffers with MAP_ASYNC and MAP_PERSISTENT.
This should have no practical impact. For the default uploader, we
don't really care, but for others, we may want to append more data
as the GPU is reading existing data, which means we need async and
persistent flags.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Kenneth Graunke [Tue, 27 Feb 2018 07:17:35 +0000 (23:17 -0800)]
i965: Generalize intel_upload.c to support multiple uploaders.
I'd like to reuse the upload logic for a new program cache, but the
buffers will need to have a different lifetime than the default
uploader, and also some address space restrictions. So, we can't
use a single uploader for both situations - we'll need two of them.
This creates a public 'uploader' structure, and adjusts the interface
to take an uploader rather than always using brw->upload. It should
have no functional change at the moment.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Anuj Phogat [Wed, 7 Feb 2018 01:09:09 +0000 (17:09 -0800)]
intel/compiler: Memory fence commit must always be enabled for gen10+
Commit bit in the message descriptor (Bit 13) must be always set
to true in CNL+ for memory fence messages. It also fixes a piglit
GPU hang on cnl+ in simulation environment.
Piglit test: arb_shader_image_load_store-shader-mem-barrier
See HSD ES #
1404612949
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Francisco Jerez [Sun, 25 Feb 2018 00:05:21 +0000 (16:05 -0800)]
Revert "i965/fs: Predicate byte scattered writes if needed"
This reverts commit
a4031bdfa927fb4c3c5d0bdadc70634f3c1a5eac. It's
redundant with the sample mask predication done at this point by the
common logical send lowering infrastructure, and rather buggy because
it wasn't applying the correct sample mask in shaders using discard,
since the dispatch mask returned by FS_OPCODE_MOV_DISPATCH_TO_FLAGS
doesn't reflect samples discarded by the shader, so it could have led
to data corruption in fragment shader invocations that execute discard
based on a non-dynamically uniform condition.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Francisco Jerez [Tue, 12 Dec 2017 20:05:04 +0000 (12:05 -0800)]
intel/fs: Handle surface opcode sample masks via predication.
The main motivation is to enable HDC surface opcodes on ICL which no
longer allows the sample mask to be provided in a message header, but
this is enabled all the way back to IVB when possible because it
decreases the instruction count of some shaders using HDC messages
significantly, e.g. one of the SynMark2 CSDof compute shaders
decreases instruction count by about 40% due to the removal of header
setup boilerplate which in turn makes a number of send message
payloads more easily CSE-able. Shader-db results on SKL:
total instructions in shared programs:
15325319 ->
15314384 (-0.07%)
instructions in affected programs: 311532 -> 300597 (-3.51%)
helped: 491
HURT: 1
Shader-db results on BDW where the optimization needs to be disabled
in some cases due to hardware restrictions:
total instructions in shared programs:
15604794 ->
15598028 (-0.04%)
instructions in affected programs: 220863 -> 214097 (-3.06%)
helped: 351
HURT: 0
The FPS of SynMark2 CSDof improves by 5.09% ±0.36% (n=10) on my SKL
laptop with this change. According to Eero this improves performance
of the same test by 9% on BYT and by 7-8% on BXT J4205 and on SKL GT2
desktop.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-By: Eero Tamminen <eero.t.tamminen@intel.com>
Francisco Jerez [Tue, 12 Dec 2017 20:05:03 +0000 (12:05 -0800)]
intel/eu: Plumb header present bit to codegen helpers for HDC messages.
This makes sure that the header-present bit of the message descriptor
is in sync with the IR instruction fields, which gives the optimizer
more control to avoid the overhead of setting up a message header when
it's possible to do so.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Francisco Jerez [Thu, 22 Feb 2018 20:49:01 +0000 (12:49 -0800)]
intel/ir: Allow arbitrary scratch flag registers for SHADER_OPCODE_FIND_LIVE_CHANNEL.
This shouldn't cause any functional change at this point, it changes
SHADER_OPCODE_FIND_LIVE_CHANNEL to use the flag register specified at
the IR level instead of the hard-coded f1.0, now that it can be
represented in backend_instruction::flag_subreg. This will be
necessary for scheduling to behave correctly once more things start
making use of f1.0.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Francisco Jerez [Tue, 12 Dec 2017 20:05:02 +0000 (12:05 -0800)]
intel/ir: Allow representing additional flag subregisters in the IR.
This allows representing conditional mods and predicates on f1.0-f1.1
at the IR level by adding an extra bit to the flag_subreg
backend_instruction field.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Francisco Jerez [Tue, 12 Dec 2017 20:05:00 +0000 (12:05 -0800)]
intel/l3: Don't allocate SLM partition on ICL+.
SLM has a chunk of special-purpose memory separate from L3 on ICL+, we
shouldn't allocate a partition for it on L3 anymore.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Charmaine Lee [Tue, 27 Feb 2018 12:09:58 +0000 (04:09 -0800)]
svga: add SVGA_NEW_PRESCALE to the tracked dirty mask for gs
Since geometry shader also consumes prescale constants, the
geometry shader constant buffer will need to be updated when prescale
factor is changed.
Reviewed-by: Brian Paul <brianp@vmware.com>
Brian Paul [Thu, 22 Feb 2018 04:00:38 +0000 (21:00 -0700)]
svga: fix blending regression
The earlier Mesa commit
3d06c8afb5 ("st/mesa: don't translate blend
state when it's disabled for a colorbuffer") subtly changed the
details of gallium's per-RT blend state.
In particular, when pipe_rt_blend_state[i].blend_enabled is true,
we have to get the src/dst blend terms from pipe_rt_blend_state[i],
not [0] as before.
We now have to scan the blend targets to find the first one that's
enabled (if any). We have to use the index of that target for getting
the src/dst blend terms. And note that we have to set identical blend
terms for all targets.
This fixes the Piglit fbo-drawbuffers2-blend test. VMware bug
2063493.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Thu, 22 Feb 2018 20:22:11 +0000 (13:22 -0700)]
svga: check svga_have_vgpu10() in svga_delete_blend_state()
We were calling SVGA3D_vgpu10_DestroyBlendState() when vgpu10 was not
enabled (bs->id==0 by default), resulting in lots of device errors.
Reviewed-by: Neha Bhende<bhenden@vmware.com>
Brian Paul [Wed, 21 Feb 2018 20:57:39 +0000 (13:57 -0700)]
svga: if svga_update_state() fails, skip the draw call
If svga_update_state() fails, we flush the command buffer and retry.
If it fails again, it likely means we were unable to translate a shader
for some reason (uses too many resources, for example). In that case,
let's just skip the draw call. The alternative, just disabling the
shader stage in question, would certainly lead to bad rendering anyway,
and probably device errors.
Fixes failed assertion running Piglit glsl-1.50/execution/
variable-indexing/gs-output-array-vec4-index-wr.shader_test since it
uses too many GS output registers (though the test still fails).
VMware bug
2063492.
v2: also call pipe_debug_message() so apps or apitrace can be notified
when this issue occurs.
v3: use svga_update_state_retry().
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Brian Paul [Thu, 22 Feb 2018 16:32:33 +0000 (09:32 -0700)]
svga: let svga_update_state_retry() return a bool
This will allow minor simplifications elsewhere.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Brian Paul [Thu, 22 Feb 2018 21:43:41 +0000 (14:43 -0700)]
svga: s/unsigned/boolean/ for a few local vars
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Dylan Baker [Fri, 2 Mar 2018 18:28:11 +0000 (10:28 -0800)]
meson: install vulkan_intel.h header
Fixes: d1992255bb29054fa51763376d125183a9f602f3
("meson: Add build Intel "anv" vulkan driver")
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Boyuan Zhang [Fri, 2 Mar 2018 16:11:01 +0000 (11:11 -0500)]
st/omx_bellagio: add picture profile and entry point
Profile and entry point were missing in the picture structure.
Therefore, add them back.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Boyuan Zhang [Tue, 27 Feb 2018 22:29:44 +0000 (17:29 -0500)]
radeonsi: fix radeon create encoder return
Previous patch missed a "return" when trying to modify the create encoder
function, which made the whole logic fail. Therefore, add the return back.
Fixes: b38b208ff8886e799d6a2 "radeonsi:create uvd hevc enc entry"
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Thierry Reding [Wed, 21 Dec 2016 13:15:06 +0000 (14:15 +0100)]
loader: Add support for platform and host1x busses
ARM SoCs usually have their DRM/KMS devices on the platform bus, so add
support for this bus in order to allow use of the DRI_PRIME environment
variable with those devices.
While at it, also support the host1x bus, which is effectively the same
but uses an additional layer in the bus hierarchy.
Note that it isn't enough to support the bus that has the rendering GPU
because the loader code will also try to construct an ID path tag for a
scanout-only device if it is the default that is being opened.
The ID path tag for a device can be obtained by running udevadm info on
the device node, as shown in this example on NVIDIA Tegra:
$ udevadm info /dev/dri/card0 | grep ID_PATH_TAG
E: ID_PATH_TAG=platform-50000000_host1x
The corresponding OF_FULLNAME property, from which the ID_PATH_TAG is
constructed, can be found in the sysfs "uevent" attribute for the card0
device's parent:
$ grep OF_FULLNAME /sys/devices/platform/
50000000.host1x/drm/uevent
OF_FULLNAME=/host1x@
50000000
Similarily, /dev/dri/card1 corresponds to the GPU:
$ udevadm info /dev/dri/card1 | grep ID_PATH_TAG
E: ID_PATH_TAG=platform-57000000_gpu
and:
$ grep OF_FULLNAME /sys/devices/platform/
57000000.gpu/uevent
OF_FULLNAME=/gpu@
57000000
Changes in v2:
- avoid confusing pre-increment in strdup()
- add examples of tags to commit message
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Thierry Reding [Fri, 23 Feb 2018 13:13:27 +0000 (14:13 +0100)]
disk cache: Link with -latomic if necessary
The disk cache implementation uses 64-bit atomic operations. For some
architectures, such as 32-bit ARM, GCC will not be able to translate
these operations into atomic, lock-free instructions and will instead
rely on the external atomics library to provide these operations.
Check at configuration time whether or not linking against libatomic
is necessary and if so, create a dependency that can be used while
linking the mesautil library.
This is the meson equivalent of
2ef7f23820a6 ("configure: check if
-latomic is needed for __atomic_*").
For some background information on this, see:
https://gcc.gnu.org/wiki/Atomic/GCCMM
Changes in v2:
- clarify meaning of lock-free in commit message
- fix build if -latomic is not necessary
Acked-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Signed-off-by: Thierry Reding <treding@nvidia.com>
Samuel Pitoiset [Thu, 1 Mar 2018 09:53:49 +0000 (10:53 +0100)]
radv: do not set pending_reset_query in BeginCommandBuffer()
This is just useless for two reasons:
1) flush_bits is not set accordingly, so nothing will be flushed
in BeginQuery().
2) we always flush caches in EndCommandBuffer(), so if a reset
is done in a previous command buffer we are safe.
Cc: "18.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Dave Airlie [Thu, 1 Mar 2018 03:38:32 +0000 (03:38 +0000)]
r600/cayman: fix fragcood loading recip generation.
This fixes some hangs seen where the recip_ieee opcodes would
end up split across the wrong slots.
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Kenneth Graunke [Mon, 12 Feb 2018 15:18:29 +0000 (07:18 -0800)]
i965: Allow 48-bit addressing on Gen8+.
This allows most GPU objects to use the full 48-bit address space
offered by Gen8+ platforms, rather than being stuck with 32-bit.
This expands the available GPU memory from 4G to 256TB or so.
A few objects - instruction, scratch, and vertex buffers - need to
remain pinned in the low 4GB of the address space for various reasons.
We default everything to 48-bit but disable it in those cases.
Thanks to Jason Ekstrand for blazing this trail in anv first and
finding the nasty undocumented hardware issues. This patch simply
rips off all of his findings.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Mon, 26 Feb 2018 23:51:04 +0000 (15:51 -0800)]
i965: Shorten the name of the workaround BO.
This makes the name shorter in debug printouts. If "workaround_bo"
is good enough for the code, it's probably good enough for debugging.
Kenneth Graunke [Tue, 28 Nov 2017 18:07:43 +0000 (10:07 -0800)]
i965: Add debugging code to dump the validation list.
When anything goes wrong with this code, dumping the validation list
is a useful way to figure out what's happening.
Jason Ekstrand [Thu, 1 Mar 2018 03:57:44 +0000 (19:57 -0800)]
intel/fs: Set up sampler message headers in the visitor on gen7+
This gives the scheduler visibility into the headers which should
improve scheduling. More importantly, however, it lets the scheduler
know that the header gets written. As-is, the scheduler thinks that a
texture instruction only reads it's payload and is unaware that it may
write to the first register so it may reorder it with respect to a read
from that register. This is causing issues in a couple of Dota 2 vertex
shaders.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104923
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Timothy Arceri [Thu, 1 Mar 2018 09:17:38 +0000 (20:17 +1100)]
ac: fix nir_intrinsic_shared_atomic_comp_swap handling
Following on from
49879f377870 this makes sure we use the correct
src index.
Fixes cts test:
KHR-GL46.compute_shader.atomic-case3
Reviewed-by: Dave Airlie <airlied@redhat.com>
Timothy Arceri [Thu, 1 Mar 2018 02:39:20 +0000 (13:39 +1100)]
st/glsl_to_nir: simplify st_nir_assign_var_locations() and fix for fs outputs
We only need to check for previously processed location on user
defined varyings as they are the only ones that support component
packing. Therefore a single instance of processed_locs can be
shared by regular varyings and patches.
For simplicity we make processed_locs an array in order to handle
dual source bleanding.
Fixes the follow piglit test on radeonsi:
tests/spec/arb_enhanced_layouts/execution/component-layout/fs-output.shader_test
Reviewed-by: Dave Airlie <airlied@redhat.com>
Jason Ekstrand [Sat, 24 Feb 2018 05:12:50 +0000 (21:12 -0800)]
anv: Enable MSAA fast-clears
This speeds up the Sascha Willems multisampling demo by around 25% when
using 8x or 16x MSAA.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Sat, 24 Feb 2018 05:12:35 +0000 (21:12 -0800)]
anv/cmd_buffer: Add support for MCS fast-clears and resolves
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Sat, 24 Feb 2018 05:00:52 +0000 (21:00 -0800)]
anv/cmd_buffer: Add helpers for computing resolve predicates
We'll want to re-use the complex resolve predicate computations for MCS
resolves so it's nice to have them as helper functions.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Sat, 24 Feb 2018 04:45:26 +0000 (20:45 -0800)]
anv/cmd_buffer: Handle MCS identical to CCS_E in compute_aux_usage
This doesn't actually do anything because att_state->fast_clear is
determined based on the return value of anv_layout_to_fast_clear_type
which currently returns NONE for multisampled images.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Sat, 24 Feb 2018 05:11:58 +0000 (21:11 -0800)]
anv/blorp: Pass the clear address to blorp for subpass MSAA resolves
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Sat, 24 Feb 2018 06:05:39 +0000 (22:05 -0800)]
anv/blorp: Allow indirect clear colors on blorp sources on gen7
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Sat, 11 Nov 2017 22:32:21 +0000 (14:32 -0800)]
anv/blorp: Add partial clear support to anv_image_mcs_op
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Sat, 11 Nov 2017 22:28:17 +0000 (14:28 -0800)]
intel/blorp: Add indirect clear color support to mcs_partial_resolve
This is a bit complicated because we have to get the indirect clear
color in there somehow. In order to not do any more work in the shader
than needed, we set it up as it's own vertex binding which points
directly at the clear color address specified by the client.
Acked-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Jason Ekstrand [Sat, 11 Nov 2017 21:40:03 +0000 (13:40 -0800)]
intel/blorp: Add a helper for filling out VERTEX_BUFFER_STATE
There are enough #ifs in there that it's kind-of pointless to duplicate
it for each buffer.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Andriy Khulap [Thu, 1 Mar 2018 08:44:28 +0000 (10:44 +0200)]
i965: Fix RELOC_WRITE typo in brw_store_data_imm64()
Fixes: 6c530ad11605
("i965: Reduce passing 2x32b of reloc_domains to 2 bits")
Signed-off-by: Andriy Khulap <andriy.khulap@globallogic.com>
Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jonathan Gray [Wed, 28 Feb 2018 10:21:14 +0000 (21:21 +1100)]
gallium/util: use sockets on PIPE_OS_UNIX in u_network
Instead of listing all the UNIX PIPE_OS platforms just use
PIPE_OS_UNIX. Makes BSD sockets available on PIPE_OS_BSD.
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Brian Paul <brianp@vmware.com>
Jonathan Gray [Wed, 28 Feb 2018 10:19:19 +0000 (21:19 +1100)]
util: use clock_gettime() on PIPE_OS_BSD
OpenBSD, FreeBSD, NetBSD and DragonFlyBSD all have clock_gettime()
so use it when PIPE_OS_BSD is defined.
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Brian Paul <brianp@vmware.com>
Jose Maria Casanova Crespo [Thu, 1 Mar 2018 17:06:52 +0000 (18:06 +0100)]
nir/search: Include 8 and 16-bit support in construct_value
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Jason Ekstrand [Wed, 28 Feb 2018 21:15:04 +0000 (13:15 -0800)]
nir/search: Support 8 and 16-bit constants in match_value
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Andres Gomez [Wed, 28 Feb 2018 21:18:59 +0000 (23:18 +0200)]
travis: make Meson find the proper llvm-config
Travis CI has moved to LLVM 5.0, and meson is detecting automatically
the available version in /usr/local/bin based on the PATH env variable
order preference.
As for 0.44.x, Meson cannot receive the path to the llvm-config binary
as a configuration parameter. See
https://github.com/mesonbuild/meson/issues/2887 and
https://github.com/dcbaker/meson/commit/
7c8b6ee3fa42f43c9ac7dcacc61a77eca3f1bcef
We want to use the custom (APT) installed version. Therefore, let's
make Meson find our wanted version sooner than the one at
/usr/local/bin
Once this is corrected, we would still need a patch similar to:
https://lists.freedesktop.org/archives/mesa-dev/2017-December/180217.html
v2: Create the link only to the specificly wanted LLVM version (Gert).
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Emil Velikov <emil.velikov@collabora.com>
Cc: Juan A. Suarez Romero <jasuarez@igalia.com>
Cc: Gert Wollny <gw.fossdev@gmail.com>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-and-Tested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-By: Gert Wollny <gw.fossdev@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Andres Gomez [Wed, 28 Feb 2018 21:15:07 +0000 (23:15 +0200)]
meson: fix LLVM version detection when <= 3.4
3 digits versions in LLVM only started from 3.4.1 on.
Hence, even if you can perfectly build with an old LLVM (< 3.4.1) in
the system while not needing LLVM at all (auto), when passing through
the LLVM version detection code, meson will fail when accessing
"_llvm_version[2]" due to:
"Index 2 out of bounds of array of size 2."
v2: Properly compare LLVM version and set patch version to 0
if < 3.4.1 (Eric).
v3: Improve the commit log explanation (Eric).
Cc: Dylan Baker <dylan@pnwbakers.com>
Cc: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Iago Toral Quiroga [Thu, 1 Mar 2018 06:59:42 +0000 (07:59 +0100)]
i965/sbe: fix number of inputs for active components
In
16631ca30ea6 we fixed gen9 active components to account for padded
inputs in the URB, which we can have with SSO programs. To do that,
instead of going through the bitfield of inputs (which doesn't include
padding information), we compute the number of inputs from the size
of the URB entry.
Unfortunately, there are some special inputs that are not stored in
the URB and that we also need to account for. These special inputs
are identified and handled during calculate_attr_overrides().
Instead of keeping track of the exact number of inputs, we just
program active components for all possible inputs like we do in
anvil.
This fixes a regression in a WebGL program that uses Point Sprite
functionality (specifically, VARYING_SLOT_PNTC).
v2:
- Add 'Fixes' tag (Mark Janes)
- make no_vue_inputs int instead of uint32_t, and add const qualifier
to num_inputs variable (Ian)
v3:
- Do not try to count inputs correctly, just program all input
slots like we do in anvil (Ken)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105224
Fixes: 16631ca30ea6 (i965/sbe: fix active components for SSO programs with over 16 inputs)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Samuel Pitoiset [Wed, 28 Feb 2018 19:28:53 +0000 (20:28 +0100)]
radv: only emit cache flushes when the pool size is large enough
This is an optimization which reduces the number of flushes for
small pool buffers.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 28 Feb 2018 19:22:29 +0000 (20:22 +0100)]
radv: keep track of the query pool size
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>