Axel Davy [Sun, 23 Sep 2018 14:22:01 +0000 (16:22 +0200)]
st/nine: Capture also default matrices for D3DSBT_ALL
We avoid allocating space for never unused matrices.
However we must do as if we had captured them.
Thus when a D3DSBT_ALL stateblock apply has fewer matrices
than device state, allocate the default matrices for the stateblock
before applying.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Axel Davy [Sun, 23 Sep 2018 14:45:30 +0000 (16:45 +0200)]
st/nine: Mark transform matrices dirty for D3DSBT_ALL
D3DSBT_ALL stateblocks capture the transform matrices.
Fixes some d3d test programs not displaying properly.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Axel Davy [Sun, 23 Sep 2018 20:28:07 +0000 (22:28 +0200)]
st/nine: Don't update unused world matrices
While to the application we have to track
accurately all 256 world matrices (including
in stateblocks), hw vertex processing enables
to set a limit to the number of world matrices
the hardware can access to in the advertised caps,
which is 8 for nine.
Thus don't bother in the stateblock code to send
the updated values for the unreachable matrices.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Axel Davy [Sat, 13 Oct 2018 20:28:34 +0000 (22:28 +0200)]
st/nine: Remove two unused states.
NINE_STATE_MATERIAL was used incorrectly at one location.
Replace it with the correct state.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Axel Davy [Sat, 13 Oct 2018 20:35:22 +0000 (22:35 +0200)]
st/nine: Remove commented nine_context_apply_stateblock
At some point the project was to adapt the
commented version to csmt.
The csmt rework enabled to fix some state aliasing
issues between stateblocks and internal state updates.
The commented version needs a lot of work to work with that.
Just drop it.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Brian Paul [Fri, 26 Oct 2018 18:34:09 +0000 (12:34 -0600)]
nir: Fix array initializer
Empty initializer is not standard C. This fixes MSVC build.
Trivial.
Jason Ekstrand [Fri, 26 Oct 2018 13:32:39 +0000 (08:32 -0500)]
anv: Return VK_ERROR_DEVICE_LOST from anv_device_set_lost
This lets us get rid of a bunch of duplicated error messages.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Jason Ekstrand [Fri, 26 Oct 2018 13:24:49 +0000 (08:24 -0500)]
anv/util: Split a vk_errorv helper out of vk_errorf
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Brian Paul [Fri, 26 Oct 2018 16:23:39 +0000 (10:23 -0600)]
scons/svga: remove opt from the list of valid build types
This reverts commit
a5fd54f8bf6713312fa5efd7ef5cd125557a0ffe.
The whole point was to add a way to pass -DVMX86_STATS to the build,
but we can do that with a command line argument when we invoke scons.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Nanley Chery [Thu, 25 Oct 2018 21:08:52 +0000 (14:08 -0700)]
intel/blorp: Define the clear value bounds for HiZ clears
Follow the restriction of making sure the clear value is between the min
and max values defined in CC_VIEWPORT. Avoids a simulator warning for
some piglit tests, one of them being:
./bin/depthstencil-render-miplevels 146 d=z32f_s8
Jason found this to fix incorrect clearing on SKL.
Fixes: 09948151ab1d5184b4dd9052bb1f710fa1e00a7b
("intel/blorp: Add the BDW+ optimized HZ_OP sequence to BLORP")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Jason Ekstrand <jason@jlekstrand.net>
Eric Engestrom [Thu, 25 Oct 2018 16:36:30 +0000 (17:36 +0100)]
radv: remove duplicate brackets in version string
MESA_GIT_SHA1 resolves to either an empty "" string if not build from git,
or " (git-
DEADBEEF)" if it is. No need to wrap it in additional "()".
Fixes: 9d40ec2cf6ec6d3d9d78 "radv: Add support for VK_KHR_driver_properties."
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Eric Engestrom [Thu, 25 Oct 2018 10:15:38 +0000 (11:15 +0100)]
vulkan: drop always-true param
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Boyuan Zhang [Tue, 23 Oct 2018 15:22:13 +0000 (11:22 -0400)]
radeon/vcn: use util function to get h264 profile idc
Use utility function for converting h264 pipe video profile to profile idc,
instead of using array.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
Boyuan Zhang [Tue, 23 Oct 2018 15:20:33 +0000 (11:20 -0400)]
radeon/vce: use util function to get h264 profile idc
Use utility function for converting h264 pipe video profile to profile idc,
instead of using array.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
Boyuan Zhang [Tue, 23 Oct 2018 15:15:52 +0000 (11:15 -0400)]
vl: get h264 profile idc
Adding a function for converting h264 pipe video profile to profile idc
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Christian König <christian.koenig at amd.com>
Jason Ekstrand [Fri, 19 Oct 2018 17:06:36 +0000 (12:06 -0500)]
intel/nir: Use the OPT macro for more passes
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Jason Ekstrand [Fri, 19 Oct 2018 21:32:15 +0000 (16:32 -0500)]
spirv: Initialize subgroup destinations with the destination type
Instead of initializing them manually, just use the type that we already
have sitting there.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Jason Ekstrand [Sat, 20 Oct 2018 00:08:58 +0000 (19:08 -0500)]
spirv: Use the right bit-size for spec constant ops
Previously, we would always pull the bit size from the destination which
is wrong for opcodes like nir_ilt where the sources are variable-sized
but the destination is a fixed size. We were getting lucky before
because nir_op_ilt returns a 32-bit value and basically everyone who
uses spec constants uses 32-bit ones.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Jason Ekstrand [Fri, 19 Oct 2018 17:12:28 +0000 (12:12 -0500)]
nir/prog: Use nir_bany in kill handling
We have a helper that does exactly what the bany_inequal was doing. It
emits the same code but is a bit higher level and is designed to operate
on a bvec4.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Jason Ekstrand [Thu, 18 Oct 2018 22:55:49 +0000 (17:55 -0500)]
glsl/nir: Use i2b instead of ine for fixing UBO/SSBO Booleans
They do the same thing in the end but i2b is a bit simpler. Also, let's
clean up the mess of code for SSBO handling with one line of builder.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Jason Ekstrand [Fri, 19 Oct 2018 15:51:47 +0000 (10:51 -0500)]
nir/system_values: Use the bit size from the load_deref
This isn't a great solution for bit-sizes but we don't have a
particularly convenient way to get a bit size from the system value enum
and this keeps the lowering pass from changing it.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Jason Ekstrand [Sat, 20 Oct 2018 02:42:22 +0000 (21:42 -0500)]
nir/opt_if: Rework condition propagation
Instead of doing our own constant folding, we just emit instructions and
let constant folding happen. This is substantially simpler and lets us
use the nir_imm_bool helper instead of dealing with the const_value's
ourselves.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Jason Ekstrand [Mon, 22 Oct 2018 19:08:13 +0000 (14:08 -0500)]
nir/search: Use the nir_imm_* helpers from nir_builder
This requires that we rework the interface a bit to use nir_builder but
that's a nice little modernization anyway.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Jason Ekstrand [Mon, 22 Oct 2018 19:08:44 +0000 (14:08 -0500)]
nir/builder: Handle 16-bit floats in nir_imm_floatN_t
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Jason Ekstrand [Fri, 19 Oct 2018 14:35:49 +0000 (09:35 -0500)]
nir/builder: Add a nir_imm_true/false helpers
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Jason Ekstrand [Mon, 22 Oct 2018 20:53:14 +0000 (15:53 -0500)]
nir/constant_folding: Use nir_src_as_bool for discard_if
Missed one while converting to the nir_src_as_* helpers.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Jason Ekstrand [Fri, 19 Oct 2018 03:26:03 +0000 (22:26 -0500)]
nir/constant_folding: Add an unreachable to a switch
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Jason Ekstrand [Thu, 18 Oct 2018 20:18:30 +0000 (15:18 -0500)]
nir/validate: Print when the validation failed
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Jason Ekstrand [Tue, 13 Mar 2018 19:26:20 +0000 (12:26 -0700)]
anv: Handle the device loss abort in anv_device_set_lost
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Jason Ekstrand [Tue, 13 Mar 2018 18:50:33 +0000 (11:50 -0700)]
anv: Add helpers for setting/checking device lost
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Jason Ekstrand [Thu, 25 Oct 2018 15:13:12 +0000 (10:13 -0500)]
anv: Provide a error message with a DEVICE_LOST
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Alex Smith [Thu, 25 Oct 2018 09:50:52 +0000 (10:50 +0100)]
anv: Fix sanitization of stencil state when the depth test is disabled
When depth testing is disabled, we shouldn't pay attention to the
specified depthCompareOp, and just treat it as always passing. Before,
if the depth test is disabled, but depthCompareOp is VK_COMPARE_OP_NEVER
(e.g. from the app having zero-initialized the structure), then
sanitize_stencil_face() would have incorrectly changed passOp to
VK_STENCIL_OP_KEEP.
v2: Roll the depthTestEnable check into the ds_aspect check below since
they now both do the same thing.
Fixes: 028e1137e6 "anv/pipeline: Be smarter about depth/stencil state"
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Samuel Pitoiset [Wed, 24 Oct 2018 06:50:26 +0000 (08:50 +0200)]
radv: implement image to image operations for R32G32B32
This should address the remaining failures in Batman Arkhman City.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107765
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 24 Oct 2018 06:50:25 +0000 (08:50 +0200)]
radv: fix a comment in radv_meta_buffer_to_image_cs_r32g32b32()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 24 Oct 2018 06:50:24 +0000 (08:50 +0200)]
radv: add get_image_stride_for_r32g32b32() helper
For the special R32G32B32 paths.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 24 Oct 2018 06:50:23 +0000 (08:50 +0200)]
radv: add create_bview_for_r32g32b32() helper
For the special R32G32B32 paths.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 24 Oct 2018 06:50:22 +0000 (08:50 +0200)]
radv: add create_buffer_from_image() helper
For the special R32G32B32 paths.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Sagar Ghuge [Wed, 24 Oct 2018 23:25:53 +0000 (16:25 -0700)]
intel/compiler: Print message descriptor as immediate source
While disassembling send(c) instruction print message descriptor as
immediate source operand along with message descriptor. This allows
assembler to read immediate source operand and set bits accordingly.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Sagar Ghuge [Wed, 24 Oct 2018 20:27:27 +0000 (13:27 -0700)]
intel/compiler: Print hex representation along with floating point value
While encoding the immediate floating point values in instruction we use
values upto precision 9, but while disassembling, we print precision to
6 places, which round up the value and gives wrong interpretation for
encoded immediate constant.
To avoid misinterpretation of encoded immediate values in instruction
and disassembled output, print hex representation along with floating
point value which can be used by assembler in future.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
David McFarland [Wed, 24 Oct 2018 00:51:09 +0000 (21:51 -0300)]
util: Change remaining uint32 cache ids to sha1
After discussion with Timothy Arceri. disk_cache_get_function_identifier
was using only the first byte of the sha1 build-id. Replace
disk_cache_get_function_identifier with implementation from
radv_get_build_id. Instead of writing a uint32_t it now writes to a
mesa_sha1. All drivers using disk_cache_get_function_identifier are
updated accordingly.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Fixes: 83ea8dd99bb1 ("util: add disk_cache_get_function_identifier()")
Hyunjun Ko [Wed, 24 Oct 2018 01:57:15 +0000 (10:57 +0900)]
freedreno: use fd_bc_alloc_batch instead of fd_batch_create.
Following the commit
2385d7b066 and
8e798e28f7, for resource dependancy
tracking.
Fixes: dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth_fbo
with FD_MESA_DEBUG=inorder
Signed-off-by: Rob Clark <robdclark@gmail.com>
Hyunjun Ko [Thu, 25 Oct 2018 08:26:19 +0000 (17:26 +0900)]
freedreno/ir3: take reg->num out of union in ir3_register
To avoid wrong result when identifying the type of register.
Ie. If the reg is an array, it might be identified as address or
predicate register.
Fixes: dEQP-GLES31.functional.ssbo.layout.random.arrays_of_arrays.6
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Thu, 25 Oct 2018 19:27:10 +0000 (15:27 -0400)]
freedreno/a6xx: disable unused groups
Don't leave vsconst/fsconst group enabled if we switch to shader with no
uniforms.
Fixes: abcdf5627a2 freedreno/a6xx: move const emit to state group
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Thu, 18 Oct 2018 13:05:52 +0000 (09:05 -0400)]
freedreno: add useful assert
Would have been useful to catch the problem fixed in
8e798e28f736e22e9e1e4534ab42a36cde14b142
Signed-off-by: Rob Clark <robdclark@gmail.com>
Alok Hota [Tue, 16 Oct 2018 23:15:29 +0000 (18:15 -0500)]
swr/rast: ignore CreateElementUnorderedAtomicMemCpy
This function's API changed between LLVM 5 and 6. Compile errors occur
when building with LLVM 6+ if LLVM 5 was used for a dist tarball
CC: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107865
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Alok Hota [Wed, 19 Sep 2018 17:42:57 +0000 (12:42 -0500)]
swr/rast: fix intrinsic/function for LLVM 7 compatibility
Converted from x86 VFMADDPS intrinsic to generic LLVM intrinsic, and
removed createInstructionSimplifierPass, which were both removed in LLVM
7.0.0
These changes combine patches we received from the community and our own
internal patches
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
Rhys Perry [Thu, 4 Oct 2018 12:40:43 +0000 (13:40 +0100)]
nvc0: increase NOUVEAU_TRANSFER_PUSHBUF_THRESHOLD to 1024 on Kepler+
Gives a +3.89% to +5.27% FPS improvement with Hitman and +2.73% to +2.82%
FPS improvement with Dirt Rally on my GTX 1060.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Bas Nieuwenhuizen [Tue, 23 Oct 2018 08:54:24 +0000 (10:54 +0200)]
radv: Emit enqueued pipeline barriers on event write.
Since the CPU can read them we need to execute any GPU->CPU
flushes before the event is written.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108524
Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Sun, 30 Sep 2018 18:02:04 +0000 (20:02 +0200)]
radv: Add support for VK_KHR_driver_properties.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Eric Engestrom [Sat, 20 Oct 2018 17:00:09 +0000 (18:00 +0100)]
util: use C99 declaration in the for-loop set_foreach() macro
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Eric Engestrom [Sat, 20 Oct 2018 17:00:08 +0000 (18:00 +0100)]
util: use C99 declaration in the for-loop hash_table_foreach() macro
Signed-off-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Dylan Baker [Tue, 23 Oct 2018 17:02:05 +0000 (10:02 -0700)]
gen: Add AMD_gpu_shader_int64.xml to tarball
CC: Ian Romanick <ian.d.romanick@intel.com>
CC: Marek Olšák <marek.olsak@amd.com>
Fixes: b3c17330e631695b5e5dc209ba9ea1a528618c97
("mesa: expose AMD_gpu_shader_int64")
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Dylan Baker [Tue, 23 Oct 2018 17:00:01 +0000 (10:00 -0700)]
gen: Add EXT_vertex_attrib_64bit.xml to dependency lists
Which is also required to put it in the tarball, a requirement for
building with meson from the tarball.
CC: Ian Romanick <ian.d.romanick@intel.com>
CC: Marek Olšák <marek.olsak@amd.com>
Fixes: 263c962cfdee6b43578ee5f28601309ea77d1434
("mesa: expose EXT_vertex_attrib_64bit")
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Eric Engestrom [Tue, 23 Oct 2018 14:37:21 +0000 (15:37 +0100)]
anv: move variable to proper scope and mark as MAYBE_UNUSED
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Eric Engestrom [Tue, 23 Oct 2018 14:27:51 +0000 (15:27 +0100)]
anv: use snprintf() instead of memset()+strcpy()
snprintf() guarantees that it will not write more chars than allowed,
and that the string will be null-terminated, without the need to fill
the whole thing with zeroes to begin with.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Eric Engestrom [Tue, 23 Oct 2018 14:25:45 +0000 (15:25 +0100)]
anv: drop unused includes
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Dylan Baker [Tue, 23 Oct 2018 18:01:12 +0000 (11:01 -0700)]
autotools: include intel_tiled_memcopy.c
There are two problems with the fixed patch. First, it fails to create a
dependency on the sourced .c file, so changes to intel_tiled_memcpy.c
won't trigger a rebuild. It also doesn't get included in the dist
tarball.
Fixes: 11b1afdc92db98e93f2ca50beeb7fc481a11e708
("i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Dylan Baker [Tue, 23 Oct 2018 17:40:15 +0000 (10:40 -0700)]
meson: fix formatting and add extra_files to i965
extra_files is just a nice way to to tell certain IDEs (and those
reading the file) that this file is also a dependency. Meson will use
the .d file generated by the compiler to figure out what the target
actually depends on.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Eduardo Lima Mitev [Tue, 23 Oct 2018 19:24:11 +0000 (21:24 +0200)]
ir3_compiler/nir: fix imageSize() for buffer-backed images
GL_EXT_texture_buffer introduced texture buffers, which can be used
in shaders through a new type imageBuffer.
Because how image access is implemented in freedreno, calling
imageSize on an imageBuffer returns the size in bytes instead of texels,
which is incorrect.
This patch adds a division of imageSize result by the bytes-per-pixel
of the image format, when image is buffer-backed.
Fixes all tests under
dEQP-GLES31.functional.image_load_store.buffer.image_size.*
v2: Pre-compute and submit the log2 of the image format's bpp as shader
constant instead of emitting the LOG2 instruction in code. (Rob Clark)
v3: Use ffs (find-first-bit) helper for computing log2 (Ilia Mirkin)
Reviewed-by: Rob Clark <robdclark@gmail.com>
Jose Fonseca [Wed, 24 Oct 2018 10:33:09 +0000 (11:33 +0100)]
nir: Fix array initializer.
Empty initializer is not standard C. This fixes MSVC build.
Trivial.
Liviu Prodea [Wed, 24 Oct 2018 10:08:35 +0000 (11:08 +0100)]
scons: Put to rest zombie texture_float build option.
I found a remnant of texture_float build option that wasn't removed in
commit
66673bef941af344314fe9c91cad8cd330b245eb
This patch removes it.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Alex Smith [Thu, 18 Oct 2018 16:29:37 +0000 (17:29 +0100)]
anv: Allow presenting via a different GPU
anv_GetPhysicalDeviceSurfaceSupportKHR will already return success for
this, but anv_GetPhysicalDevice{Xcb,Xlib}PresentationSupportKHR do not.
Apps which check for presentation support via the latter (all Feral
Vulkan games at least) will therefore fail.
This allows me to render on an Intel GPU and present to a display
connected to an AMD card (tested HD 530 + Vega 64).
v2: Rebase on current master.
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Juan A. Suarez Romero [Tue, 23 Oct 2018 13:55:11 +0000 (15:55 +0200)]
nir: fix nir_copy_propagation test
Use nir_src_comp_as_uint() to read the proper second component, as
nir_src_as_uint() returns the first one.
v2: Use nir_src_comp_as_uint() [Jason]
Fixes: 16870de8a0a ("nir: Use nir_src_is_const and nir_src_as_* in core
code")
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108532
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Timothy Arceri [Tue, 23 Oct 2018 10:56:31 +0000 (21:56 +1100)]
radv: call nir_link_xfb_varyings()
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Timothy Arceri [Tue, 23 Oct 2018 10:56:30 +0000 (21:56 +1100)]
radv: move nir_lower_io_to_scalar_early() to radv_link_shaders()
nir_lower_io_to_scalar_early() is really part of the link time
optimisations. Moving it here allows the code to be simplified
and also keeps the code easy to follow in the next patch.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Samuel Pitoiset [Tue, 23 Oct 2018 10:56:29 +0000 (21:56 +1100)]
nir: add linking helper nir_link_xfb_varyings()
The linking opts shouldn't try removing or compacting XFB varyings
in the consumer. To avoid this we copy the always_active_io flag
from the producer.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Sagar Ghuge [Sat, 20 Oct 2018 01:25:23 +0000 (18:25 -0700)]
intel/compiler: Change src1 reg type to unsigned doubleword
To have uniform behavior while disassembling send(c) instruction use
register type of unsigned doubleword for src1 when message descriptor is
immediate value. Bspec does not specifiy anything for src1 immediate
default type.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Eduardo Lima Mitev [Tue, 23 Oct 2018 05:56:58 +0000 (07:56 +0200)]
mesa/glformats: Remove redundant helper _mesa_base_format_component_count
There exists _mesa_components_in_format() which already includes
all cases handled in _mesa_base_format_component_count().
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Jason Ekstrand [Mon, 22 Oct 2018 23:29:52 +0000 (18:29 -0500)]
nir/algebraic: Fix a typo in the bit size validation code
The conon_bit_class and canon_var_class variables got switched.
Fixes: 932c650e0b "nir/algebraic: Loosen a restriction on variables"
Reported-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Leo Liu [Tue, 23 Oct 2018 16:57:31 +0000 (12:57 -0400)]
amd/common: check DRM version 3.27 for JPEG decode
JPEG was added after DRM version 3.26
Signed-off-by: Leo Liu <leo.liu@amd.com>
Fixes: 4558758c51749(amd/common: add vcn jpeg ip info query)
Cc: Boyuan Zhang <boyuan.zhang@amd.com>
Cc: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Juan A. Suarez Romero [Fri, 5 Oct 2018 09:14:59 +0000 (11:14 +0200)]
docs: update calendar
I'll take care of 18.2 releases series on Andres behalf.
CC: Andres Gomez <agomez@igalia.com>
CC: Dylan Baker <dylan@pnwbakers.com>
CC: Emil Velikov <emil.l.velikov@gmail.com>
Lionel Landwerlin [Tue, 23 Oct 2018 00:39:39 +0000 (01:39 +0100)]
intel/decoders: fix end of batch limit
Pointer arithmetic...
v2: s/4/sizeof(uint32_t)/ (Eric)
v3: Give bytes to print_batch() in error_decode (Lionel)
Make clear what values we're dealing with in error_decode (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> (v2)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:30 +0000 (15:03 -0400)]
radeonsi: enable vcn jpeg decode for raven
Enable vcn jpeg decode for raven.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:29 +0000 (15:03 -0400)]
winsys/amdgpu: add vcn jpeg cs support
Add vcn jpeg cs support, align cs by no-op.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:28 +0000 (15:03 -0400)]
amd/common: add vcn jpeg ip info query
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:27 +0000 (15:03 -0400)]
radeon/vcn: implement jpeg target buffer cmd
Implement jpeg target buffer cmd by programming registers directly,
since there is no firmware for VCN Jpeg decode.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:26 +0000 (15:03 -0400)]
radeon/vcn: implement jpeg bitstream buffer cmd
Implement jpeg bitstream buffer cmd by programming registers directly,
since there is no firmware for VCN Jpeg decode.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:25 +0000 (15:03 -0400)]
radeon/uvd: remove get mjpeg slice header
Move the previous get_mjpeg_slice_heaeder function and eoi from
"radeon/vcn" to "st/va".
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:24 +0000 (15:03 -0400)]
st/va: get mjpeg slice header
Move the previous get_mjpeg_slice_heaeder function and eoi from
"radeon/vcn" to "st/va".
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:23 +0000 (15:03 -0400)]
radeon/vcn: add jpeg decode implementation
Add a new file to handle VCN Jpeg decode specific functions. Use Jpeg
specific cmd sending function in end_frame call.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:22 +0000 (15:03 -0400)]
radeon/vcn: separate send cmd call from end frame
Use function pointer for sending cmd in end_frame call. By doing this, we can
assign different cmd sending logics for Jpeg decode later.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:21 +0000 (15:03 -0400)]
radeon/vcn: create cs based on ring type
Add RING_VCN_JPEG for VCN Jpeg decode, and keep RING_VCN_DEC for other codecs.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:20 +0000 (15:03 -0400)]
radeon/winsys: add vcn jpeg ring type
Add a new ring type for vcn jpeg.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:19 +0000 (15:03 -0400)]
radeon/vcn: add vcn jpeg decode interface
Add VCN Jpeg decode interfaces and register defines.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:18 +0000 (15:03 -0400)]
radeon/vcn: move radeon decoder define to header file
Move radeon_decoder definition from "radeon_vcn_dec.c" to "radeon_vcn_dec.h",
so that it can be included by other files later.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:17 +0000 (15:03 -0400)]
meson: update required amdgpu version to 2.4.95
VCN jpeg requires new hw ip
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Wed, 17 Oct 2018 19:03:16 +0000 (15:03 -0400)]
configure.ac: update libdrm amdgpu version to 2.4.95
VCN jpeg requires new hw ip
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Samuel Pitoiset [Mon, 22 Oct 2018 13:42:31 +0000 (15:42 +0200)]
radv: fix btoi for R32G32B32 when the dest offset is not 0
Fixes: 593996bc02 ("radv: implement buffer to image operations for R32G32B32")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Scott D Phillips [Mon, 24 Sep 2018 08:39:33 +0000 (11:39 +0300)]
i965/miptree: Use cpu tiling/detiling when mapping
Rename the (un)map_gtt functions to (un)map_map (map by
returning a map) and add new functions (un)map_tiled_memcpy that
return a shadow buffer populated with the intel_tiled_memcpy
functions.
Tiling/detiling with the cpu will be the only way to handle Yf/Ys
tiling, when support is added for those formats.
v2: Compute extents properly in the x|y-rounded-down case (Chris Wilson)
v3: Add units to parameter names of tile_extents (Nanley Chery)
Use _mesa_align_malloc for the shadow copy (Nanley)
Continue using gtt maps on gen4 (Nanley)
v4: Use streaming_load_memcpy when detiling
v5: (edited by Ken) Move map_tiled_memcpy above map_movntdqa, so it
takes precedence. Add intel_miptree_access_raw, needed after
rebasing on commit
b499b85b0f2cc0c82b7c9af91502c2814fdc8e67.
v6: refactor to changes done for sse41 separation (Tapani)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v5)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Scott D Phillips [Mon, 24 Sep 2018 05:33:06 +0000 (08:33 +0300)]
i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear
The reference for MOVNTDQA says:
For WC memory type, the nontemporal hint may be implemented by
loading a temporary internal buffer with the equivalent of an
aligned cache line without filling this data to the cache.
[...] Subsequent MOVNTDQA reads to unread portions of the WC
cache line will receive data from the temporary internal
buffer if data is available.
This hidden cache line sized temporary buffer can improve the
read performance from wc maps.
v2: Add mfence at start of tiled_to_linear for streaming loads (Chris)
v3: add Android build support (Tapani)
v4: squash 'fix i915: Fix streaming loads for intel_tiled_memcpy'
separate sse41 to own static library (Tapani)
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> (v2)
Reviewed-by: Matt Turner <mattst88@gmail.com> (v2)
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Tapani Pälli [Wed, 19 Sep 2018 07:16:58 +0000 (10:16 +0300)]
i965: expose type of memcpy instead of memcpy function itself
There is currently no use of returned memcpy functions outside
intel_tiled_memcpy. Patch changes intel_get_memcpy to return memcpy
type instead of actual function. This makes it easier later to separate
streaming load copy in to own static library.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Eric Engestrom [Tue, 16 Oct 2018 08:43:07 +0000 (09:43 +0100)]
util: use *unsigned* ints for bit operations
Fixes errors thrown by GCC's Undefined Behaviour sanitizer (ubsan) every
time this macro is used.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Eric Engestrom [Thu, 18 Oct 2018 14:51:47 +0000 (15:51 +0100)]
radv: s/abs/fabsf/ for floats
Fixes: a4c4efad89eceb26cf82 "radv: Rework guard band calculation"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Eric Engestrom [Thu, 11 Oct 2018 15:38:24 +0000 (16:38 +0100)]
meson: drop option description relic
`platforms` is no longer a comma-separated string, and some of our
option descriptions are way too long already. Just drop the incorrect
bit.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Jason Ekstrand [Tue, 2 Oct 2018 03:16:59 +0000 (22:16 -0500)]
st/mesa: Record shader access qualifiers for images
They're not required to be the same as the access flag on the image
unit. For hardware that does shader image lowering based on the
qualifier (Intel), it may be required for state setup.
v2: (by Kenneth Graunke, incorporating feedback from Marek Olšák)
- Reduce both access and shader_access to uint16_t to avoid making
the pipe_image_view structure larger.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Jason Ekstrand [Fri, 19 Oct 2018 19:33:36 +0000 (14:33 -0500)]
nir/algebraic: Provide descriptive asserts for bit size checks
This will hopefully make debugging opt_algebraic bit-size compile
failures easier.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Fri, 19 Oct 2018 19:31:19 +0000 (14:31 -0500)]
nir/algebraic: Loosen a restriction on variables
Previously, we would fail if a variable had an assigned but unknown bit
size X and we tried to assign it an actual bit size. However, this is
ok because, at the time we do the search, the variable does have an
actual bit size and it will match X because of the NIR rules.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Fri, 19 Oct 2018 19:03:24 +0000 (14:03 -0500)]
nir/algebraic: A bit of validation refactoring'
We rename some local variables in validate() to be more readable and
plumb the var through to get/set_var_bit_class instead of the var index.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Fri, 19 Oct 2018 19:01:31 +0000 (14:01 -0500)]
nir/algebraic: Make internal classes str-able
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Jason Ekstrand [Fri, 19 Oct 2018 17:43:43 +0000 (12:43 -0500)]
nir/algebraic: Generalize an optimization
There's nothing boolean about (a | ~a) ~> -1
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>