mesa.git
5 years agoutil: use C99 declaration in the for-loop set_foreach() macro
Eric Engestrom [Sat, 20 Oct 2018 17:00:09 +0000 (18:00 +0100)]
util: use C99 declaration in the for-loop set_foreach() macro
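
A minimal sketch of the idea in the title, assuming a made-up `struct int_set` and `int_set_foreach()` macro (neither is Mesa's actual set implementation): declaring the iterator inside the `for` statement keeps it scoped to the loop, so callers do not have to declare it beforehand.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-size set, for illustration only. */
struct int_set {
    size_t count;
    int items[8];
};

/* C99 style: the iterator `it` is declared by the macro itself and
 * only exists for the duration of the loop. */
#define int_set_foreach(set, it) \
    for (const int *it = (set)->items; \
         it < (set)->items + (set)->count; it++)

static int int_set_sum(const struct int_set *set)
{
    assert(set->count <= 8);

    int sum = 0;
    /* No separate declaration of `entry` is needed here. */
    int_set_foreach(set, entry)
        sum += *entry;
    return sum;
}
```

With a pre-C99 style macro the caller would have to write `const int *entry;` before the loop, and the variable would stay visible after it.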

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agoutil: use C99 declaration in the for-loop hash_table_foreach() macro
Eric Engestrom [Sat, 20 Oct 2018 17:00:08 +0000 (18:00 +0100)]
util: use C99 declaration in the for-loop hash_table_foreach() macro

Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agogen: Add AMD_gpu_shader_int64.xml to tarball
Dylan Baker [Tue, 23 Oct 2018 17:02:05 +0000 (10:02 -0700)]
gen: Add AMD_gpu_shader_int64.xml to tarball

CC: Ian Romanick <ian.d.romanick@intel.com>
CC: Marek Olšák <marek.olsak@amd.com>
Fixes: b3c17330e631695b5e5dc209ba9ea1a528618c97
       ("mesa: expose AMD_gpu_shader_int64")
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
5 years agogen: Add EXT_vertex_attrib_64bit.xml to dependency lists
Dylan Baker [Tue, 23 Oct 2018 17:00:01 +0000 (10:00 -0700)]
gen: Add EXT_vertex_attrib_64bit.xml to dependency lists

Which is also required to put it in the tarball, a requirement for
building with meson from the tarball.

CC: Ian Romanick <ian.d.romanick@intel.com>
CC: Marek Olšák <marek.olsak@amd.com>
Fixes: 263c962cfdee6b43578ee5f28601309ea77d1434
       ("mesa: expose EXT_vertex_attrib_64bit")
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
5 years agoanv: move variable to proper scope and mark as MAYBE_UNUSED
Eric Engestrom [Tue, 23 Oct 2018 14:37:21 +0000 (15:37 +0100)]
anv: move variable to proper scope and mark as MAYBE_UNUSED

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv: use snprintf() instead of memset()+strcpy()
Eric Engestrom [Tue, 23 Oct 2018 14:27:51 +0000 (15:27 +0100)]
anv: use snprintf() instead of memset()+strcpy()

snprintf() guarantees that it will not write more chars than allowed,
and that the string will be null-terminated, without the need to fill
the whole thing with zeroes to begin with.
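
The two patterns can be compared side by side in a small sketch. The buffer size `NAME_LEN` and the helper names are assumptions made for illustration, not code from the driver.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_LEN 8  /* hypothetical buffer size */

/* Old pattern: zero the buffer, then copy. strcpy() does not check the
 * destination size, so the caller must guarantee that src fits. */
static void set_name_old(char dst[NAME_LEN], const char *src)
{
    assert(strlen(src) < NAME_LEN);
    memset(dst, 0, NAME_LEN);
    strcpy(dst, src);
}

/* New pattern: snprintf() writes at most NAME_LEN bytes, including the
 * terminating NUL, and truncates src if it is too long. */
static void set_name_new(char dst[NAME_LEN], const char *src)
{
    snprintf(dst, NAME_LEN, "%s", src);
}
```

One difference to keep in mind: snprintf() does not zero the bytes after the terminating NUL, so this swap is only equivalent when nothing depends on the tail of the buffer being cleared.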

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoanv: drop unused includes
Eric Engestrom [Tue, 23 Oct 2018 14:25:45 +0000 (15:25 +0100)]
anv: drop unused includes

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agoautotools: include intel_tiled_memcopy.c
Dylan Baker [Tue, 23 Oct 2018 18:01:12 +0000 (11:01 -0700)]
autotools: include intel_tiled_memcopy.c

There are two problems with the fixed patch. First, it fails to create a
dependency on the sourced .c file, so changes to intel_tiled_memcpy.c
won't trigger a rebuild. It also doesn't get included in the dist
tarball.

Fixes: 11b1afdc92db98e93f2ca50beeb7fc481a11e708
       ("i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
5 years agomeson: fix formatting and add extra_files to i965
Dylan Baker [Tue, 23 Oct 2018 17:40:15 +0000 (10:40 -0700)]
meson: fix formatting and add extra_files to i965

extra_files is just a nice way to tell certain IDEs (and those
reading the file) that this file is also a dependency. Meson will use
the .d file generated by the compiler to figure out what the target
actually depends on.

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
5 years agoir3_compiler/nir: fix imageSize() for buffer-backed images
Eduardo Lima Mitev [Tue, 23 Oct 2018 19:24:11 +0000 (21:24 +0200)]
ir3_compiler/nir: fix imageSize() for buffer-backed images

GL_EXT_texture_buffer introduced texture buffers, which can be used
in shaders through a new type imageBuffer.

Because of how image access is implemented in freedreno, calling
imageSize on an imageBuffer returns the size in bytes instead of texels,
which is incorrect.

This patch divides the imageSize result by the bytes-per-pixel
of the image format when the image is buffer-backed.

Fixes all tests under
dEQP-GLES31.functional.image_load_store.buffer.image_size.*

v2: Pre-compute and submit the log2 of the image format's bpp as shader
    constant instead of emitting the LOG2 instruction in code. (Rob Clark)

v3: Use ffs (find-first-bit) helper for computing log2 (Ilia Mirkin)
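
The arithmetic from v2 and v3 can be sketched on the CPU side as follows. The function names are hypothetical and this is not the ir3 code; it only assumes that the bytes-per-pixel value is a power of two, so the division becomes a right shift by a precomputed log2.

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h>  /* ffs() (POSIX) */

/* For a power-of-two bpp, ffs() returns the 1-based index of the only
 * set bit, so log2(bpp) == ffs(bpp) - 1. */
static uint32_t log2_bytes_per_pixel(uint32_t bpp)
{
    assert(bpp != 0 && (bpp & (bpp - 1)) == 0);
    return (uint32_t)ffs((int)bpp) - 1;
}

/* Convert a buffer size in bytes to a size in texels using the
 * precomputed log2 instead of a division. */
static uint32_t buffer_image_size_texels(uint32_t size_bytes,
                                         uint32_t log2_bpp)
{
    return size_bytes >> log2_bpp;
}
```

For example, a 1024-byte buffer with a 16-byte-per-pixel format gives a shift of 4 and a size of 64 texels.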

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years agonir: Fix array initializer.
Jose Fonseca [Wed, 24 Oct 2018 10:33:09 +0000 (11:33 +0100)]
nir: Fix array initializer.

Empty initializer is not standard C.  This fixes MSVC build.

Trivial.
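
For illustration, a portable way to zero-initialize an array looks like the sketch below; the function is hypothetical and not the code touched by this commit. An empty `= {}` initializer is accepted by GCC and Clang as an extension, but it is not part of C99 or C11.

```c
/* `int values[4] = {};` is a compiler extension before C23.
 * `{ 0 }` is standard: the first element is set explicitly and the
 * remaining elements are zero-initialized. */
static int sum_zeroed(void)
{
    int values[4] = { 0 };
    int sum = 0;

    for (int i = 0; i < 4; i++)
        sum += values[i];
    return sum;
}
```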

5 years agoscons: Put to rest zombie texture_float build option.
Liviu Prodea [Wed, 24 Oct 2018 10:08:35 +0000 (11:08 +0100)]
scons: Put to rest zombie texture_float build option.

I found a remnant of texture_float build option that wasn't removed in
commit 66673bef941af344314fe9c91cad8cd330b245eb

This patch removes it.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
5 years agoanv: Allow presenting via a different GPU
Alex Smith [Thu, 18 Oct 2018 16:29:37 +0000 (17:29 +0100)]
anv: Allow presenting via a different GPU

anv_GetPhysicalDeviceSurfaceSupportKHR will already return success for
this, but anv_GetPhysicalDevice{Xcb,Xlib}PresentationSupportKHR do not.
Apps which check for presentation support via the latter (all Feral
Vulkan games at least) will therefore fail.

This allows me to render on an Intel GPU and present to a display
connected to an AMD card (tested HD 530 + Vega 64).

v2: Rebase on current master.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agonir: fix nir_copy_propagation test
Juan A. Suarez Romero [Tue, 23 Oct 2018 13:55:11 +0000 (15:55 +0200)]
nir: fix nir_copy_propagation test

Use nir_src_comp_as_uint() to read the proper second component, as
nir_src_as_uint() returns the first one.

v2: Use nir_src_comp_as_uint() [Jason]

Fixes: 16870de8a0a ("nir: Use nir_src_is_const and nir_src_as_* in core
                     code")
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108532
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
5 years agoradv: call nir_link_xfb_varyings()
Timothy Arceri [Tue, 23 Oct 2018 10:56:31 +0000 (21:56 +1100)]
radv: call nir_link_xfb_varyings()

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agoradv: move nir_lower_io_to_scalar_early() to radv_link_shaders()
Timothy Arceri [Tue, 23 Oct 2018 10:56:30 +0000 (21:56 +1100)]
radv: move nir_lower_io_to_scalar_early() to radv_link_shaders()

nir_lower_io_to_scalar_early() is really part of the link time
optimisations. Moving it here allows the code to be simplified
and also keeps the code easy to follow in the next patch.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
5 years agonir: add linking helper nir_link_xfb_varyings()
Samuel Pitoiset [Tue, 23 Oct 2018 10:56:29 +0000 (21:56 +1100)]
nir: add linking helper nir_link_xfb_varyings()

The linking opts shouldn't try removing or compacting XFB varyings
in the consumer. To avoid this we copy the always_active_io flag
from the producer.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
5 years agointel/compiler: Change src1 reg type to unsigned doubleword
Sagar Ghuge [Sat, 20 Oct 2018 01:25:23 +0000 (18:25 -0700)]
intel/compiler: Change src1 reg type to unsigned doubleword

To get uniform behavior when disassembling send(c) instructions, use a
register type of unsigned doubleword for src1 when the message descriptor
is an immediate value. The Bspec does not specify a default type for an
immediate src1.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
5 years agomesa/glformats: Remove redundant helper _mesa_base_format_component_count
Eduardo Lima Mitev [Tue, 23 Oct 2018 05:56:58 +0000 (07:56 +0200)]
mesa/glformats: Remove redundant helper _mesa_base_format_component_count

There exists _mesa_components_in_format() which already includes
all cases handled in _mesa_base_format_component_count().

Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
5 years agonir/algebraic: Fix a typo in the bit size validation code
Jason Ekstrand [Mon, 22 Oct 2018 23:29:52 +0000 (18:29 -0500)]
nir/algebraic: Fix a typo in the bit size validation code

The canon_bit_class and canon_var_class variables got switched.

Fixes: 932c650e0b "nir/algebraic: Loosen a restriction on variables"
Reported-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
5 years agoamd/common: check DRM version 3.27 for JPEG decode
Leo Liu [Tue, 23 Oct 2018 16:57:31 +0000 (12:57 -0400)]
amd/common: check DRM version 3.27 for JPEG decode

JPEG was added after DRM version 3.26
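
A version gate of this kind can be sketched as below. The helper names are assumptions for illustration and do not come from the amd/common code; only the 3.27 threshold is taken from the commit title.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if (major, minor) >= (req_major, req_minor). */
static bool drm_version_at_least(uint32_t major, uint32_t minor,
                                 uint32_t req_major, uint32_t req_minor)
{
    return major > req_major ||
           (major == req_major && minor >= req_minor);
}

/* JPEG decode is only reported for DRM 3.27 and newer. */
static bool jpeg_decode_supported(uint32_t drm_major, uint32_t drm_minor)
{
    return drm_version_at_least(drm_major, drm_minor, 3, 27);
}
```

Comparing the major number first avoids wrongly rejecting a later major version whose minor number is below 27.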

Signed-off-by: Leo Liu <leo.liu@amd.com>
Fixes: 4558758c51749(amd/common: add vcn jpeg ip info query)
Cc: Boyuan Zhang <boyuan.zhang@amd.com>
Cc: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
5 years agodocs: update calendar
Juan A. Suarez Romero [Fri, 5 Oct 2018 09:14:59 +0000 (11:14 +0200)]
docs: update calendar

I'll take care of the 18.2 release series on Andres' behalf.

CC: Andres Gomez <agomez@igalia.com>
CC: Dylan Baker <dylan@pnwbakers.com>
CC: Emil Velikov <emil.l.velikov@gmail.com>
5 years agointel/decoders: fix end of batch limit
Lionel Landwerlin [Tue, 23 Oct 2018 00:39:39 +0000 (01:39 +0100)]
intel/decoders: fix end of batch limit

Pointer arithmetic...

v2: s/4/sizeof(uint32_t)/ (Eric)

v3: Give bytes to print_batch() in error_decode (Lionel)
    Make clear what values we're dealing with in error_decode (Lionel)
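
The commit message does not spell out the bug, so the following is only a sketch of the general pitfall, with hypothetical function names: adding N to a `uint32_t *` advances N dwords, not N bytes, so a size in bytes must be converted before computing the end pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Pointer arithmetic on uint32_t * is in units of dwords, so a byte
 * count has to be divided by sizeof(uint32_t) first. */
static const uint32_t *batch_end(const uint32_t *batch, size_t size_bytes)
{
    return batch + size_bytes / sizeof(uint32_t);
}

/* Walk the batch one dword at a time up to the computed end. */
static size_t count_dwords(const uint32_t *batch, size_t size_bytes)
{
    size_t n = 0;

    for (const uint32_t *p = batch; p < batch_end(batch, size_bytes); p++)
        n++;
    return n;
}
```

Writing `batch + size_bytes` instead would put the end pointer four times too far away and let a decoder read past the buffer.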

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v2)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
5 years agoradeonsi: enable vcn jpeg decode for raven
Boyuan Zhang [Wed, 17 Oct 2018 19:03:30 +0000 (15:03 -0400)]
radeonsi: enable vcn jpeg decode for raven

Enable vcn jpeg decode for raven.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
5 years agowinsys/amdgpu: add vcn jpeg cs support
Boyuan Zhang [Wed, 17 Oct 2018 19:03:29 +0000 (15:03 -0400)]
winsys/amdgpu: add vcn jpeg cs support

Add vcn jpeg cs support, align cs by no-op.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
5 years agoamd/common: add vcn jpeg ip info query
Boyuan Zhang [Wed, 17 Oct 2018 19:03:28 +0000 (15:03 -0400)]
amd/common: add vcn jpeg ip info query

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
5 years agoradeon/vcn: implement jpeg target buffer cmd
Boyuan Zhang [Wed, 17 Oct 2018 19:03:27 +0000 (15:03 -0400)]
radeon/vcn: implement jpeg target buffer cmd

Implement jpeg target buffer cmd by programming registers directly,
since there is no firmware for VCN Jpeg decode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
5 years agoradeon/vcn: implement jpeg bitstream buffer cmd
Boyuan Zhang [Wed, 17 Oct 2018 19:03:26 +0000 (15:03 -0400)]
radeon/vcn: implement jpeg bitstream buffer cmd

Implement jpeg bitstream buffer cmd by programming registers directly,
since there is no firmware for VCN Jpeg decode.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
5 years agoradeon/uvd: remove get mjpeg slice header
Boyuan Zhang [Wed, 17 Oct 2018 19:03:25 +0000 (15:03 -0400)]
radeon/uvd: remove get mjpeg slice header

Move the previous get_mjpeg_slice_header function and eoi from
"radeon/vcn" to "st/va".

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
5 years agost/va: get mjpeg slice header
Boyuan Zhang [Wed, 17 Oct 2018 19:03:24 +0000 (15:03 -0400)]
st/va: get mjpeg slice header

Move the previous get_mjpeg_slice_header function and eoi from
"radeon/vcn" to "st/va".

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
5 years agoradeon/vcn: add jpeg decode implementation
Boyuan Zhang [Wed, 17 Oct 2018 19:03:23 +0000 (15:03 -0400)]
radeon/vcn: add jpeg decode implementation

Add a new file to handle VCN Jpeg decode specific functions. Use Jpeg
specific cmd sending function in end_frame call.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
5 years agoradeon/vcn: separate send cmd call from end frame
Boyuan Zhang [Wed, 17 Oct 2018 19:03:22 +0000 (15:03 -0400)]
radeon/vcn: separate send cmd call from end frame

Use a function pointer for sending cmd in the end_frame call. By doing this, we can
assign different cmd sending logic for Jpeg decode later.
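
The pattern can be sketched as follows. `struct decoder`, the `CMD_*` values and the helper functions are made up for illustration and are not the radeon_decoder definition.

```c
/* Hypothetical decoder with a per-codec command sender. */
struct decoder {
    void (*send_cmd)(struct decoder *dec);
    int last_cmd;
};

enum { CMD_GENERIC = 1, CMD_JPEG = 2 };

static void send_cmd_generic(struct decoder *dec)
{
    dec->last_cmd = CMD_GENERIC;
}

static void send_cmd_jpeg(struct decoder *dec)
{
    dec->last_cmd = CMD_JPEG;
}

/* The sender is chosen once, when the decoder is set up. */
static void decoder_init(struct decoder *dec, int is_jpeg)
{
    dec->last_cmd = 0;
    dec->send_cmd = is_jpeg ? send_cmd_jpeg : send_cmd_generic;
}

/* end_frame no longer needs to know which codec is in use. */
static void end_frame(struct decoder *dec)
{
    dec->send_cmd(dec);
}
```

The benefit is that adding another codec later only requires a new sender function and a change at init time, while end_frame stays untouched.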

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
5 years agoradeon/vcn: create cs based on ring type
Boyuan Zhang [Wed, 17 Oct 2018 19:03:21 +0000 (15:03 -0400)]
radeon/vcn: create cs based on ring type

Add RING_VCN_JPEG for VCN Jpeg decode, and keep RING_VCN_DEC for other codecs.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
5 years agoradeon/winsys: add vcn jpeg ring type
Boyuan Zhang [Wed, 17 Oct 2018 19:03:20 +0000 (15:03 -0400)]
radeon/winsys: add vcn jpeg ring type

Add a new ring type for vcn jpeg.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
5 years agoradeon/vcn: add vcn jpeg decode interface
Boyuan Zhang [Wed, 17 Oct 2018 19:03:19 +0000 (15:03 -0400)]
radeon/vcn: add vcn jpeg decode interface

Add VCN Jpeg decode interfaces and register defines.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
5 years agoradeon/vcn: move radeon decoder define to header file
Boyuan Zhang [Wed, 17 Oct 2018 19:03:18 +0000 (15:03 -0400)]
radeon/vcn: move radeon decoder define to header file

Move radeon_decoder definition from "radeon_vcn_dec.c" to "radeon_vcn_dec.h",
so that it can be included by other files later.

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
5 years agomeson: update required amdgpu version to 2.4.95
Boyuan Zhang [Wed, 17 Oct 2018 19:03:17 +0000 (15:03 -0400)]
meson: update required amdgpu version to 2.4.95

VCN jpeg requires new hw ip

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
5 years agoconfigure.ac: update libdrm amdgpu version to 2.4.95
Boyuan Zhang [Wed, 17 Oct 2018 19:03:16 +0000 (15:03 -0400)]
configure.ac: update libdrm amdgpu version to 2.4.95

VCN jpeg requires new hw ip

Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
5 years agoradv: fix btoi for R32G32B32 when the dest offset is not 0
Samuel Pitoiset [Mon, 22 Oct 2018 13:42:31 +0000 (15:42 +0200)]
radv: fix btoi for R32G32B32 when the dest offset is not 0

Fixes: 593996bc02 ("radv: implement buffer to image operations for R32G32B32")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoi965/miptree: Use cpu tiling/detiling when mapping
Scott D Phillips [Mon, 24 Sep 2018 08:39:33 +0000 (11:39 +0300)]
i965/miptree: Use cpu tiling/detiling when mapping

Rename the (un)map_gtt functions to (un)map_map (map by
returning a map) and add new functions (un)map_tiled_memcpy that
return a shadow buffer populated with the intel_tiled_memcpy
functions.

Tiling/detiling with the cpu will be the only way to handle Yf/Ys
tiling, when support is added for those formats.

v2: Compute extents properly in the x|y-rounded-down case (Chris Wilson)

v3: Add units to parameter names of tile_extents (Nanley Chery)
    Use _mesa_align_malloc for the shadow copy (Nanley)
    Continue using gtt maps on gen4 (Nanley)

v4: Use streaming_load_memcpy when detiling

v5: (edited by Ken) Move map_tiled_memcpy above map_movntdqa, so it
    takes precedence.  Add intel_miptree_access_raw, needed after
    rebasing on commit b499b85b0f2cc0c82b7c9af91502c2814fdc8e67.

v6: refactor to changes done for sse41 separation (Tapani)

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v5)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
5 years agoi965/tiled_memcpy: inline movntdqa loads in tiled_to_linear
Scott D Phillips [Mon, 24 Sep 2018 05:33:06 +0000 (08:33 +0300)]
i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear

The reference for MOVNTDQA says:

    For WC memory type, the nontemporal hint may be implemented by
    loading a temporary internal buffer with the equivalent of an
    aligned cache line without filling this data to the cache.
    [...] Subsequent MOVNTDQA reads to unread portions of the WC
    cache line will receive data from the temporary internal
    buffer if data is available.

This hidden cache line sized temporary buffer can improve the
read performance from wc maps.

v2: Add mfence at start of tiled_to_linear for streaming loads (Chris)
v3: add Android build support (Tapani)
v4: squash 'fix i915: Fix streaming loads for intel_tiled_memcpy'
    separate sse41 to own static library (Tapani)

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Reviewed-by: Matt Turner <mattst88@gmail.com> (v2)
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
5 years agoi965: expose type of memcpy instead of memcpy function itself
Tapani Pälli [Wed, 19 Sep 2018 07:16:58 +0000 (10:16 +0300)]
i965: expose type of memcpy instead of memcpy function itself

There is currently no use of the returned memcpy functions outside
intel_tiled_memcpy. This patch changes intel_get_memcpy to return the
memcpy type instead of the actual function. This makes it easier to later
separate the streaming load copy into its own static library.

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agoutil: use *unsigned* ints for bit operations
Eric Engestrom [Tue, 16 Oct 2018 08:43:07 +0000 (09:43 +0100)]
util: use *unsigned* ints for bit operations

Fixes errors thrown by GCC's Undefined Behaviour sanitizer (ubsan) every
time this macro is used.
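
The commit does not name the macro, so this is a sketch with a hypothetical `BIT_U32()` macro, assuming a 32-bit `int`: shifting a signed `1` into the sign bit is undefined behaviour, while the same shift on an unsigned operand is well defined.

```c
#include <stdint.h>

/* `1 << 31` shifts a set bit into the sign bit of a 32-bit int, which
 * is undefined behaviour and is what ubsan reports. Using an unsigned
 * literal makes the shift well defined for b in 0..31. */
#define BIT_U32(b) (1u << (b))

static uint32_t top_bit(void)
{
    return BIT_U32(31);
}

/* Mask with the low `n` bits set, for n in 1..31. */
static uint32_t low_mask(unsigned n)
{
    return BIT_U32(n) - 1u;
}
```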

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
5 years agoradv: s/abs/fabsf/ for floats
Eric Engestrom [Thu, 18 Oct 2018 14:51:47 +0000 (15:51 +0100)]
radv: s/abs/fabsf/ for floats
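
Why the substitution matters can be shown with a small sketch; the function names are hypothetical and this is not the guard band code. `abs()` takes an `int`, so a float argument is converted to an integer before the absolute value is taken, and the fractional part is lost.

```c
#include <math.h>
#include <stdlib.h>

/* abs() operates on int. The explicit cast here does what the implicit
 * conversion does when a float is passed directly: it truncates. */
static float magnitude_truncating(float x)
{
    return (float)abs((int)x);
}

/* fabsf() operates on float and keeps the fractional part. */
static float magnitude_float(float x)
{
    return fabsf(x);
}
```

For an input of -0.75 the first function yields 0 and the second yields 0.75.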

Fixes: a4c4efad89eceb26cf82 "radv: Rework guard band calculation"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agomeson: drop option description relic
Eric Engestrom [Thu, 11 Oct 2018 15:38:24 +0000 (16:38 +0100)]
meson: drop option description relic

`platforms` is no longer a comma-separated string, and some of our
option descriptions are way too long already. Just drop the incorrect
bit.

Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
5 years agost/mesa: Record shader access qualifiers for images
Jason Ekstrand [Tue, 2 Oct 2018 03:16:59 +0000 (22:16 -0500)]
st/mesa: Record shader access qualifiers for images

They're not required to be the same as the access flag on the image
unit.  For hardware that does shader image lowering based on the
qualifier (Intel), it may be required for state setup.

v2: (by Kenneth Graunke, incorporating feedback from Marek Olšák)
 - Reduce both access and shader_access to uint16_t to avoid making
   the pipe_image_view structure larger.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agonir/algebraic: Provide descriptive asserts for bit size checks
Jason Ekstrand [Fri, 19 Oct 2018 19:33:36 +0000 (14:33 -0500)]
nir/algebraic: Provide descriptive asserts for bit size checks

This will hopefully make debugging opt_algebraic bit-size compile
failures easier.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
5 years agonir/algebraic: Loosen a restriction on variables
Jason Ekstrand [Fri, 19 Oct 2018 19:31:19 +0000 (14:31 -0500)]
nir/algebraic: Loosen a restriction on variables

Previously, we would fail if a variable had an assigned but unknown bit
size X and we tried to assign it an actual bit size.  However, this is
ok because, at the time we do the search, the variable does have an
actual bit size and it will match X because of the NIR rules.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
5 years agonir/algebraic: A bit of validation refactoring'
Jason Ekstrand [Fri, 19 Oct 2018 19:03:24 +0000 (14:03 -0500)]
nir/algebraic: A bit of validation refactoring'

We rename some local variables in validate() to be more readable and
plumb the var through to get/set_var_bit_class instead of the var index.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
5 years agonir/algebraic: Make internal classes str-able
Jason Ekstrand [Fri, 19 Oct 2018 19:01:31 +0000 (14:01 -0500)]
nir/algebraic: Make internal classes str-able

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
5 years agonir/algebraic: Generalize an optimization
Jason Ekstrand [Fri, 19 Oct 2018 17:43:43 +0000 (12:43 -0500)]
nir/algebraic: Generalize an optimization

There's nothing boolean about (a | ~a) ~> -1
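
The identity is easy to check on plain integers. This is a sketch in C rather than the nir_algebraic rule itself; the function name is made up.

```c
#include <stdint.h>

/* Every bit is set either in a or in ~a, so the OR has all bits set.
 * As a two's complement signed value, all-ones is -1. */
static uint32_t or_with_complement(uint32_t a)
{
    return a | ~a;
}
```

The result does not depend on `a` being 0 or all-ones, which is why the rule need not be restricted to boolean values.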

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
5 years agonir/algebraic: Use bool internally instead of bool32
Jason Ekstrand [Fri, 19 Oct 2018 03:31:08 +0000 (22:31 -0500)]
nir/algebraic: Use bool internally instead of bool32

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
5 years agointel: Fix decoding for partial STATE_BASE_ADDRESS updates.
Kenneth Graunke [Mon, 22 Oct 2018 04:41:39 +0000 (21:41 -0700)]
intel: Fix decoding for partial STATE_BASE_ADDRESS updates.

STATE_BASE_ADDRESS only modifies various bases if the "modify" bit is
set.  Otherwise, we want to keep the existing base address.

Iris uses this for updating Surface State Base Address while leaving the
others as-is.
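
A decoder that honours the "modify" bit could be sketched like this. The struct and function names are hypothetical and do not match the actual decoder code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded base address field: the address only counts if
 * the accompanying modify-enable bit is set. */
struct base_address_field {
    bool modify_enable;
    uint64_t address;
};

/* Keep the previously decoded base unless the packet modifies it. */
static void update_base(uint64_t *current,
                        const struct base_address_field *field)
{
    if (field->modify_enable)
        *current = field->address;
}
```

Applied to a packet that only sets the modify bit for the surface state base, the other tracked bases keep their earlier values instead of being reset.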

v2: Also update aubinator_viewer_decoder (caught by Lionel)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agonir: Use nir_src_is_const and nir_src_as_* in core code
Jason Ekstrand [Sat, 20 Oct 2018 14:10:02 +0000 (09:10 -0500)]
nir: Use nir_src_is_const and nir_src_as_* in core code

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agonir/search_helpers: Use nir_src_is_const and friends
Jason Ekstrand [Sat, 20 Oct 2018 17:07:41 +0000 (12:07 -0500)]
nir/search_helpers: Use nir_src_is_const and friends

This not only makes them safe for more bit sizes but it also fixes a bug
in is_zero_to_one where it would return true for constant NaN.
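
The commit message does not show how the NaN case slipped through. One common way for such a bug to arise, sketched here with hypothetical functions on plain doubles rather than NIR constants, is a range check that only rejects values proven to be out of range.

```c
#include <math.h>
#include <stdbool.h>

/* Rejects only values that compare as out of range. Every ordered
 * comparison with NaN is false, so NaN is accepted by mistake. */
static bool is_zero_to_one_naive(double v)
{
    if (v < 0.0 || v > 1.0)
        return false;
    return true;
}

/* Same check with NaN rejected explicitly. */
static bool is_zero_to_one_checked(double v)
{
    if (isnan(v) || v < 0.0 || v > 1.0)
        return false;
    return true;
}
```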

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agonir/search: Use nir_src_is_const and friends
Jason Ekstrand [Sat, 20 Oct 2018 17:17:30 +0000 (12:17 -0500)]
nir/search: Use nir_src_is_const and friends

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agonir: Add some new helpers for working with const sources
Jason Ekstrand [Sat, 20 Oct 2018 13:36:21 +0000 (08:36 -0500)]
nir: Add some new helpers for working with const sources

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agomesa/st: Only call nir_lower_io_to_scalar_early on scalar ISAs
Alyssa Rosenzweig [Sun, 21 Oct 2018 18:29:37 +0000 (11:29 -0700)]
mesa/st: Only call nir_lower_io_to_scalar_early on scalar ISAs

On scalar ISAs, nir_lower_io_to_scalar_early enables significant
optimizations. However, on vector ISAs, it is counterproductive and
impedes optimal codegen. This patch only calls
nir_lower_io_to_scalar_early for scalar ISAs. It appears that at present
there are no upstreamed drivers using Gallium, NIR, and a vector ISA, so
for existing code, this should be a no-op. However, this patch is
necessary for the upcoming Panfrost (Midgard) and Lima (Utgard)
compilers, which are vector.

With this patch, Panfrost is able to consume NIR directly, rather than
TGSI with the TGSI->NIR conversion.

For how this affects Lima, see
https://www.mail-archive.com/mesa-dev@lists.freedesktop.org/msg189216.html

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
5 years agomeson: don't require libelf for r600 without LLVM
Dylan Baker [Mon, 22 Oct 2018 14:26:44 +0000 (07:26 -0700)]
meson: don't require libelf for r600 without LLVM

r600 doesn't have a hard requirement on LLVM, and therefore doesn't have
a hard requirement on libelf. Currently the logic doesn't allow that
however.

Distro-bug: https://bugs.gentoo.org/669058
Fixes: 5060c51b6f4dfb0d5358bde6523285163d3faaad
       ("meson: build r600 driver")
Reviewed-by: Matt Turner <mattst88@gmail.com>
5 years agoanv,radv: Trivially expose two new VK_GOOGLE extensions
Jason Ekstrand [Sat, 13 Oct 2018 13:46:20 +0000 (08:46 -0500)]
anv,radv: Trivially expose two new VK_GOOGLE extensions

This patch exposes support for the following two extensions:

 * VK_GOOGLE_decorate_string
 * VK_GOOGLE_hlsl_functionality1

There's nothing for the driver to do; it's all handled in spirv_to_nir.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107971
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agospirv: Add no-op support for VK_GOOGLE_hlsl_functionality1
Jason Ekstrand [Sat, 13 Oct 2018 13:41:36 +0000 (08:41 -0500)]
spirv: Add no-op support for VK_GOOGLE_hlsl_functionality1

This extension adds two new decorations which carry meaning only for
HLSL shaders.  They are expected to be handled by higher level layers
and can be ignored by implementations.  However, it does save the client
a bit of work if the implementation safely ignores them instead of the
client having to strip them out of the SPIR-V in order for it to be
valid.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agospirv: Add support for SPV_GOOGLE_decorate_string
Jason Ekstrand [Sat, 13 Oct 2018 13:33:22 +0000 (08:33 -0500)]
spirv: Add support for SPV_GOOGLE_decorate_string

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoandroid: Build kms_swrast for the Android platform
Rob Herring [Tue, 24 Jul 2018 09:09:39 +0000 (11:09 +0200)]
android: Build kms_swrast for the Android platform

Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
5 years agoac: Fix loading a dvec3 from an SSBO
Connor Abbott [Thu, 18 Oct 2018 13:39:13 +0000 (15:39 +0200)]
ac: Fix loading a dvec3 from an SSBO

The comment was wrong, since the loop above casts to a type with the
correct bitsize already.

Fixes: 7e7ee82698247d8f93fe37775b99f4838b0247dd ("ac: add support for 16bit buffer loads")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoac: Introduce ac_build_expand()
Connor Abbott [Thu, 18 Oct 2018 13:30:11 +0000 (15:30 +0200)]
ac: Introduce ac_build_expand()

And implement ac_build_expand_to_vec4() on top of it.

Fixes: 7e7ee82698247d8f93fe37775b99f4838b0247dd ("ac: add support for 16bit buffer loads")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoir3/nir: Set up image_dims consts for image_deref_size intrinsic too
Eduardo Lima Mitev [Sun, 21 Oct 2018 18:48:41 +0000 (20:48 +0200)]
ir3/nir: Set up image_dims consts for image_deref_size intrinsic too

`nir_intrinsic_image_deref_size` is not being considered during scan for
driver constants, so image constants are not emitted if a shader
only ever queries the size of an image (no load, store, atomic op, etc.).
This is unlikely, but possible.

Reviewed-by: Rob Clark <robdclark@gmail.com>
5 years agonv50/ir: fix ConstantFolding::createMul for 64 bit muls
Karol Herbst [Fri, 19 Oct 2018 17:26:39 +0000 (19:26 +0200)]
nv50/ir: fix ConstantFolding::createMul for 64 bit muls

Fixes: 2f52925f5c60c72c9389bfdc122c3d5f8e15b25f
       "nv50/ir: move a * b -> a << log2(b) code into createMul()"

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
5 years agoradeonsi: Disable clear_state with radeon kernel driver
Sonny Jiang [Fri, 19 Oct 2018 20:16:41 +0000 (16:16 -0400)]
radeonsi: Disable clear_state with radeon kernel driver

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
5 years agomeson: Add -Werror=return-type when supported.
Kenneth Graunke [Tue, 30 Jan 2018 09:32:07 +0000 (01:32 -0800)]
meson: Add -Werror=return-type when supported.

This warning detects non-void functions with a missing return statement,
return statements with a value in void functions, and functions with a
bogus return type that ends up defaulting to int.  It's already enabled
by default with -Wall.  Generally, these are fairly serious bugs in the
code, which developers would like to notice and fix immediately.  This
patch promotes it from a warning to an error, to help developers catch
such mistakes early.

I would not expect this warning to change much based on the compiler
version, so hopefully it won't become a problem for packagers/builders.

See the GCC documentation or 'man gcc' for more details:
https://gcc.gnu.org/onlinedocs/gcc-7.3.0/gcc/Warning-Options.html#index-Wreturn-type

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoanv: Define trampolines as the weak functions
Jason Ekstrand [Mon, 15 Oct 2018 03:20:17 +0000 (22:20 -0500)]
anv: Define trampolines as the weak functions

Instead of having weak references to the anv functions and separate
trampoline functions with their own dispatch table, just make the
trampoline functions weak.  This gets rid of a dispatch table and
potentially lets the compiler delete the unused weak function.  The
end result is a reduction in the .text section of 5.7K and a reduction
in the .data section of 1.4K.

Before:

   text    data     bss     dec     hex filename
3190329  282232    8960 3481521  351fb1 _install/lib64/libvulkan_intel.so

After:

   text    data     bss     dec     hex filename
3184548  280792    8960 3474300  35037c _install/lib64/libvulkan_intel.so
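
The general mechanism behind weak definitions can be sketched as below. This only illustrates the GCC/Clang `weak` attribute with made-up function names; it is not how the anv entrypoint generator is written.

```c
/* GCC/Clang extension: a weak definition is used only if no strong
 * definition of the same symbol is linked in. A strong definition in
 * another translation unit silently replaces this one. */
__attribute__((weak)) int example_entrypoint(int x)
{
    (void)x;
    return -1;  /* hypothetical fallback result */
}

/* Callers reference the symbol normally; the linker decides which
 * definition they end up calling. */
static int call_entrypoint(int x)
{
    return example_entrypoint(x);
}
```

Built on its own, this file resolves `example_entrypoint` to the weak fallback. Linking it with another object that defines a non-weak `example_entrypoint` would make that definition win without any dispatch table.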

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agodocs: fix typo in 18.2.3 release notes link
Juan A. Suarez Romero [Fri, 19 Oct 2018 16:47:45 +0000 (18:47 +0200)]
docs: fix typo in 18.2.3 release notes link

Fixes: 86b4bd52dc ("docs: update calendar, add news item and link
release notes for 18.2.3")

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
5 years agodocs: update calendar, add news item and link release notes for 18.2.3
Juan A. Suarez Romero [Fri, 19 Oct 2018 16:45:41 +0000 (18:45 +0200)]
docs: update calendar, add news item and link release notes for 18.2.3

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
5 years agodocs: add sha256 checksums for 18.2.3
Juan A. Suarez Romero [Fri, 19 Oct 2018 16:43:26 +0000 (18:43 +0200)]
docs: add sha256 checksums for 18.2.3

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 27fd12857b53ec22c0e918eee6c4c009643fccbc)

5 years agodocs: add release notes for 18.2.3
Juan A. Suarez Romero [Fri, 19 Oct 2018 16:02:51 +0000 (18:02 +0200)]
docs: add release notes for 18.2.3

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit d219361b4226944835959676d1721b2a9d29da72)

5 years agoscons: Remove gles option.
Jose Fonseca [Thu, 18 Oct 2018 14:04:49 +0000 (15:04 +0100)]
scons: Remove gles option.

It's broken, and the WGL state tracker is always built with GLES support
nowadays.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agoradv: Fix WSI & PCI bus info initialization order.
Bas Nieuwenhuizen [Fri, 19 Oct 2018 09:51:47 +0000 (11:51 +0200)]
radv: Fix WSI & PCI bus info initialization order.

Trying to access the bus info before it is initialized is not going
to work.

Fixes: baa38c144f6 "vulkan/wsi: Use VK_EXT_pci_bus_info for DRM fd matching"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108491
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Andre Heider <a.heider@gmail.com>
5 years agoradeonsi: fix a typo in a comment in emit_guardband
Marek Olšák [Thu, 18 Oct 2018 22:01:00 +0000 (18:01 -0400)]
radeonsi: fix a typo in a comment in emit_guardband

5 years agoradeonsi: fix gnome-shell crash
Marek Olšák [Thu, 18 Oct 2018 21:54:24 +0000 (17:54 -0400)]
radeonsi: fix gnome-shell crash

I wasn't expecting to get viewports with the center having
negative coordinates.

Broken by: 6cc79e4411f

5 years agoRevert "anv: Stop generating weak references for instance entrypoints"
Jason Ekstrand [Mon, 15 Oct 2018 02:56:47 +0000 (21:56 -0500)]
Revert "anv: Stop generating weak references for instance entrypoints"

This reverts commit 00bb42105d6edf6e432c0e3712ffb9d3eb0aece4.  It was
not as well thought out as I had intended and broke the build when
VK_KHR_display is disabled in the build.

5 years agoradeonsi: clamp point size to the limit
Marek Olšák [Wed, 17 Oct 2018 16:26:54 +0000 (12:26 -0400)]
radeonsi: clamp point size to the limit

This fixes dEQP-GLES2.functional.rasterization.limits.points.
Broken by: ea039f789d9b54e1bd1d644b6a29863ca3500314

Tested-by: Jakob Bornecrantz <jakob@collabora.com>
5 years agoradeonsi: fix a VGT hang with primitive restart on Polaris10 and later
Marek Olšák [Tue, 16 Oct 2018 19:10:01 +0000 (15:10 -0400)]
radeonsi: fix a VGT hang with primitive restart on Polaris10 and later

Cc: 18.1 18.2 <mesa-stable@lists.freedesktop.org>
Tested-by: Jakob Bornecrantz <jakob@collabora.com>
5 years agoradeonsi: fix a deadlock due to partially-initialized context on CI
Marek Olšák [Wed, 17 Oct 2018 16:41:38 +0000 (12:41 -0400)]
radeonsi: fix a deadlock due to partially-initialized context on CI

5 years agoradeonsi: Bump number of allowed global buffers to 32
Jan Vesely [Thu, 18 Oct 2018 19:15:06 +0000 (15:15 -0400)]
radeonsi: Bump number of allowed global buffers to 32

Fixes assertion failure/crash when running luxmark/luxball on clover.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108272
CC: mesa-stable@lists.freedesktop.org
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
5 years agoradv: fix check for perftest options size
Andres Rodriguez [Thu, 18 Oct 2018 19:32:31 +0000 (15:32 -0400)]
radv: fix check for perftest options size

It was using the debug options array size.
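
The class of bug can be sketched as follows.  This is an illustration only:
the table contents and the names demo_debug_options, demo_perftest_options
and demo_get_perftest_option_name are hypothetical, not the radv source.
The point is that a bounds check has to use the size of the array that is
actually indexed.

```c
#include <stddef.h>

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Two unrelated option tables of different lengths (made-up contents). */
static const char *const demo_debug_options[] = { "dbg0", "dbg1", "dbg2" };
static const char *const demo_perftest_options[] = { "perf0", "perf1" };

/* Returns the name for a perftest option id, or NULL if out of range.
 * The bound is taken from the perftest table itself; checking against
 * DEMO_ARRAY_SIZE(demo_debug_options) would let id == 2 read past the
 * end of the two-entry perftest table. */
static const char *demo_get_perftest_option_name(size_t id)
{
    if (id >= DEMO_ARRAY_SIZE(demo_perftest_options))
        return NULL;
    return demo_perftest_options[id];
}
```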

CC: mesa-stable@lists.freedesktop.org
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradeonsi: fix incorrect hw screen offset and guardband computation
Marek Olšák [Thu, 18 Oct 2018 18:42:42 +0000 (14:42 -0400)]
radeonsi: fix incorrect hw screen offset and guardband computation

It resulted in assertion failures or incorrect rendering.

Broken by: 9e182b8313c5ab952498a76495f57e8420f9e5ad

5 years agovulkan/wsi: Use VK_EXT_pci_bus_info for DRM fd matching
Jason Ekstrand [Thu, 18 Oct 2018 15:08:32 +0000 (10:08 -0500)]
vulkan/wsi: Use VK_EXT_pci_bus_info for DRM fd matching

This lets us avoid passing the DRM fd around all over the place and gets
us closer to layer utopia.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
5 years agoloader/dri3: Also wait for front buffer fence if we triggered it
Michel Dänzer [Mon, 1 Oct 2018 16:43:46 +0000 (18:43 +0200)]
loader/dri3: Also wait for front buffer fence if we triggered it

In that case, we have to wait for the fence to synchronize with the
corresponding drawing we triggered in the X server.

Fixes incorrect display with the i965 driver and some applications, e.g.
solvespace.

Bugzilla: https://bugs.freedesktop.org/108097
Fixes: aefac10fecc9 "loader/dri3: Only wait for back buffer fences in
                     dri3_get_buffer"
Tested-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
5 years agoanv: Stop generating weak references for instance entrypoints
Jason Ekstrand [Mon, 15 Oct 2018 02:56:47 +0000 (21:56 -0500)]
anv: Stop generating weak references for instance entrypoints

We don't need weak references to instance entrypoints because we never
have more than one of each so we don't need the NULL fall-back.  This
also helps us avoid forgetting things because we now get link errors for
missing instance entrypoints.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agovulkan/wsi: Implement GetPhysicalDevicePresentRectanglesKHR
Jason Ekstrand [Mon, 15 Oct 2018 02:56:34 +0000 (21:56 -0500)]
vulkan/wsi: Implement GetPhysicalDevicePresentRectanglesKHR

This got missed during 1.1 enabling because it was defined as an
interaction between device groups and WSI and it wasn't obvious it was
in the delta.

The idea behind it is that it's supposed to provide a hint to the
application in a multi-GPU setup to indicate which regions of the screen
are being scanned out by which GPU so a multi-device split-screen
rendering application can render each part of the screen on the GPU that
will be presenting it and avoid extra bus traffic between GPUs.  On a
single-GPU setup or one which doesn't support this present mode, we need
to do something.  We choose to return the window size (or a max-size
rect) if the compositor, X server, or crtc is associated with the given
physical device and zero rectangles otherwise.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agovulkan/wsi: Store the instance allocator in wsi_device
Jason Ekstrand [Wed, 17 Oct 2018 19:35:16 +0000 (14:35 -0500)]
vulkan/wsi: Store the instance allocator in wsi_device

We already have wsi_device and we know the instance allocator at
wsi_device_init time so there's no need to pass it into the physical
device queries.  This also fixes a memory allocation domain bug that can
occur if CreateSwapchain gets called prior to any queries (not likely)
in which case the cached connection gets allocated off the device
instead of the instance.

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
5 years agost/xlib: Use more appropriate include guard
Michał Janiszewski [Tue, 16 Oct 2018 21:44:22 +0000 (23:44 +0200)]
st/xlib: Use more appropriate include guard

Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
5 years agogallium: Fix mismatched ifdef-guards
Michał Janiszewski [Tue, 16 Oct 2018 21:44:21 +0000 (23:44 +0200)]
gallium: Fix mismatched ifdef-guards

Signed-off-by: Michał Janiszewski <janisozaur+signed@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
5 years agosoftpipe: dynamically allocate space for immediate constants
Gert Wollny [Tue, 16 Oct 2018 08:07:49 +0000 (10:07 +0200)]
softpipe: dynamically allocate space for immediate constants

The number of immediate constants was fixed and the size check was
only done by means of an assertion. Given this, a shader that emits
more immediate constants would result in memory corruption when
Mesa is built in release mode.

Instead of using this fixed limit allocate the space dynamically, let it
grow as needed, and also remove the unused ImmArray.
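
A rough sketch of the grow-as-needed approach, not the softpipe code: the
struct imm_store, the helper names, the initial capacity of 16 and the
doubling policy are all assumptions made for illustration.  Unlike an
assertion, the capacity check stays in place in release builds.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable store of 4-component immediates. */
struct imm_store {
    float (*values)[4];
    unsigned count;
    unsigned capacity;
};

/* Appends one immediate and returns its index, or -1 if allocation
 * fails.  The buffer is reallocated (doubled) when it is full. */
static int imm_store_push(struct imm_store *s, const float v[4])
{
    if (s->count == s->capacity) {
        unsigned new_cap = s->capacity ? s->capacity * 2 : 16;
        float (*grown)[4] = realloc(s->values, new_cap * sizeof(*grown));
        if (!grown)
            return -1;
        s->values = grown;
        s->capacity = new_cap;
    }
    memcpy(s->values[s->count], v, sizeof(s->values[0]));
    return (int)s->count++;
}

/* Releases the buffer and resets the store to its empty state. */
static void imm_store_free(struct imm_store *s)
{
    free(s->values);
    memset(s, 0, sizeof(*s));
}
```

A zero-initialized struct imm_store is a valid empty store, so callers need
no separate init step.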

Fixes: dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.1
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
5 years agoradv: use nir_shrink_vec_array_vars()
Timothy Arceri [Wed, 17 Oct 2018 23:23:42 +0000 (10:23 +1100)]
radv: use nir_shrink_vec_array_vars()

Totals from affected shaders:
SGPRS: 1096 -> 1096 (0.00 %)
VGPRS: 1192 -> 1056 (-11.41 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 100940 -> 94384 (-6.49 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 100 -> 112 (12.00 %)
Wait states: 0 -> 0 (0.00 %)

All affected shaders are from Batman Arkham City.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: use nir_split_array_vars()
Timothy Arceri [Wed, 17 Oct 2018 23:19:16 +0000 (10:19 +1100)]
radv: use nir_split_array_vars()

We call it in the opt loop in case another pass results in an
array with indirect access being turned into direct access.

Totals from affected shaders:
SGPRS: 512 -> 496 (-3.12 %)
VGPRS: 456 -> 452 (-0.88 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 40040 -> 39664 (-0.94 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 41 -> 43 (4.88 %)
Wait states: 0 -> 0 (0.00 %)

All affected shaders are from Batman Arkham City.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: use nir_opt_find_array_copies()
Timothy Arceri [Wed, 17 Oct 2018 22:42:17 +0000 (09:42 +1100)]
radv: use nir_opt_find_array_copies()

Totals from affected shaders:
SGPRS: 1112 -> 1112 (0.00 %)
VGPRS: 1492 -> 1196 (-19.84 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 112172 -> 101316 (-9.68 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 93 -> 98 (5.38 %)
Wait states: 0 -> 0 (0.00 %)

All affected shaders are from "Batman: Arkham City" over DXVK.

The pass detects that the temporary array created by DXVK for
storing TCS inputs is a copy of the input arrays. This allows
us to avoid copying all of the input data and then indirecting
on it with if-ladders; instead we just do indirect indexing.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agoradv: use nir_opt_copy_prop_vars and nir_opt_dead_write_vars
Timothy Arceri [Wed, 17 Oct 2018 21:55:46 +0000 (08:55 +1100)]
radv: use nir_opt_copy_prop_vars and nir_opt_dead_write_vars

Totals from affected shaders:
SGPRS: 2856 -> 2856 (0.00 %)
VGPRS: 3236 -> 3248 (0.37 %)
Spilled SGPRs: 0 -> 0 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 236560 -> 233548 (-1.27 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 277 -> 283 (2.17 %)
Wait states: 0 -> 0 (0.00 %)

Even in the cases where we have increased VGPR use, it appears
the NIR is improved significantly.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agovulkan: Add VK_EXT_calibrated_timestamps extension (radv and anv) [v5]
Keith Packard [Thu, 11 Oct 2018 23:05:18 +0000 (16:05 -0700)]
vulkan: Add VK_EXT_calibrated_timestamps extension (radv and anv) [v5]

Offers three clocks: device, clock monotonic and clock monotonic
raw. Could use some kernel support to reduce the deviation between
clock values.

v2:
Ensure deviation is at least as big as the GPU time interval.

v3:
Set device->lost when returning DEVICE_LOST.
Use MAX2 and DIV_ROUND_UP instead of open coding these.
Delete spurious TIMESTAMP in radv version.

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
v4:
Add anv_gem_reg_read to anv_gem_stubs.c

Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
v5:
Adjust maxDeviation computation to max(sampled_clock_period) +
sample_interval.
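
The v5 formula can be sketched as below.  This is an illustration under
assumptions, not the driver code: the function name, the array-of-periods
interface and the use of nanoseconds for every value are hypothetical.
begin and end stand for host timestamps taken just before and just after
all clocks were sampled.

```c
#include <stdint.h>

/* maxDeviation = max(sampled_clock_period) + sample_interval.
 * clock_periods holds the tick period of each sampled clock, and
 * end - begin is the time spent taking the samples. */
static uint64_t demo_max_deviation(const uint64_t *clock_periods,
                                   unsigned num_clocks,
                                   uint64_t begin, uint64_t end)
{
    uint64_t max_period = 0;
    for (unsigned i = 0; i < num_clocks; i++) {
        if (clock_periods[i] > max_period)
            max_period = clock_periods[i];
    }

    uint64_t sample_interval = end - begin;
    return max_period + sample_interval;
}
```

The coarsest clock bounds how stale any single sample can be, and the
sampling window bounds how far apart the samples were taken, so the sum is
a conservative bound.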

Suggested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Keith Packard <keithp@keithp.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
5 years agointel/compiler/icl: Use invocation id bits 22:16 instead of 23:17
Topi Pohjolainen [Tue, 16 Oct 2018 11:56:51 +0000 (07:56 -0400)]
intel/compiler/icl: Use invocation id bits 22:16 instead of 23:17

Identifier bits in the dispatch header have changed. See Bspec:

SINGLE_PATCH Payload:

3D Pipeline Stages - 3D Pipeline Geometry -
Hull Shader (HS) Stage IVB+ - Payloads IVB+
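
The bit-range change in the subject line can be illustrated with a small
field-extraction sketch.  The helper names and the uses_bits_22_16 flag
are hypothetical; only the two bit ranges come from the subject line.

```c
#include <stdint.h>

/* Extracts the inclusive bit range [high:low] from a 32-bit value. */
static uint32_t demo_extract_bits(uint32_t value, unsigned high, unsigned low)
{
    unsigned width = high - low + 1;
    uint32_t mask = width >= 32 ? 0xffffffffu : ((1u << width) - 1u);
    return (value >> low) & mask;
}

/* Reads the 7-bit invocation id from a header dword, using bits 22:16
 * when uses_bits_22_16 is non-zero and bits 23:17 otherwise. */
static uint32_t demo_invocation_id(uint32_t header_dword, int uses_bits_22_16)
{
    return uses_bits_22_16 ? demo_extract_bits(header_dword, 22, 16)
                           : demo_extract_bits(header_dword, 23, 17);
}
```

Reading the field one bit too high halves the value and drops its low bit,
so an id of 5 stored at bits 22:16 would be read as 2 with the old range.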

Fixes: KHR-GL46.tessellation_shader.tessellation_shader_tc_barriers.barrier_guarded_read_write_calls
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
5 years agoFix setting indent-tabs-mode in the Emacs .dir-locals.el files
Neil Roberts [Wed, 17 Oct 2018 15:27:07 +0000 (17:27 +0200)]
Fix setting indent-tabs-mode in the Emacs .dir-locals.el files

Some of the .dir-locals.el had the wrong name for the truthy value so
it wasn’t setting indent-tabs-mode.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>