Bas Nieuwenhuizen [Thu, 19 Apr 2018 05:29:03 +0000 (07:29 +0200)]
radv: Add bound checking workaround for dynamic buffers.
I have seen a few applications and games get the dynamic buffer bounds wrong; this
makes it easier to work around, e.g. for debugging.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Thomas Hellstrom [Thu, 12 Apr 2018 12:41:47 +0000 (14:41 +0200)]
svga: Fix incorrect advertising of EGL_KHR_gl_colorspace
When advertising this extension, egl_dri2 uses the DRI2_RENDERER_QUERY
extension to query whether an sRGB format is supported. That extension will
query our driver with the BIND flag PIPE_BIND_RENDER_TARGET rather than
PIPE_BIND_DISPLAY_TARGET which is used when building the configs.
We only return the correct value for PIPE_BIND_DISPLAY_TARGET.
The inconsistency causes EGL to crash at surface initialization if sRGB is
not supported. Fix this by supporting both bind flags.
Testing done:
piglit egl_gl_colorspace srgb
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Mike Lothian [Wed, 4 Apr 2018 08:22:54 +0000 (09:22 +0100)]
swr: Fix include for createPromoteMemoryToRegisterPass
Include llvm/Transforms/Utils.h with the newest LLVM 7
v2: Include with " " rather than < > (Vinson Lee)
v3: Use LLVM_VERSION_MAJOR rather than HAVE_LLVM (George Kyriazis)
Signed-off-by: Mike Lothian <mike@fireburn.co.uk>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-By: George Kyriazis <george.kyriazis@intel.com>
Samuel Pitoiset [Tue, 17 Apr 2018 14:05:18 +0000 (16:05 +0200)]
radv: enable DCC for MSAA 2x textures on VI under an option
This can be enabled with RADV_PERFTEST=dccmsaa.
DCC for MSAA textures is actually not that easy to implement. It
looks like there are some corner cases. I will improve support
incrementally.
Vega support, as well as Polaris improvements, will be added later.
No CTS changes on Polaris using RADV_DEBUG=zerovram and
RADV_PERFTEST=dccmsaa.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 17 Apr 2018 14:05:17 +0000 (16:05 +0200)]
radv: decompress DCC for multisampled source images before resolving
Multisampled source images (i.e. color attachments) can now be
DCC compressed, so the driver needs to perform a DCC decompression
pass before resolving.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 17 Apr 2018 14:05:16 +0000 (16:05 +0200)]
radv: add a workaround for fast clears with DCC and MSAA textures
This should be fixed at some point in order to improve
performance.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 17 Apr 2018 14:05:15 +0000 (16:05 +0200)]
radv: allocate CMASK for DCC fast clear with MSAA
CMASK is required because it should be cleared to
0xCCCCCCCC for MSAA textures.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 17 Apr 2018 14:05:14 +0000 (16:05 +0200)]
radv: implement fast color clear for DCC with MSAA
When DCC is enabled with MSAA textures, CMASK should be
cleared to 0xCCCCCCCC.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Tue, 17 Apr 2018 13:08:11 +0000 (15:08 +0200)]
radv: make sure to sync after resolving using the compute path
This fixes some random CTS failures:
dEQP-VK.renderpass.multisample.*.
Performing a fast-clear eliminate is still useless, but it
seems that we need to sync.
Found while running CTS with RADV_DEBUG=zerovram.
Fixes: 56a171a499c ("radv: don't fast-clear eliminate after resolving a subpass with compute")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Samuel Pitoiset [Wed, 18 Apr 2018 16:53:44 +0000 (18:53 +0200)]
radv: dump the SHA1 of SPIRV in the hang report
Might be useful for debugging purposes, especially when we
want to replace a shader on the fly.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Bas Nieuwenhuizen [Wed, 11 Apr 2018 17:08:30 +0000 (19:08 +0200)]
radv: Enable VK_EXT_descriptor_indexing.
This adds everything except non-uniform indexing, which needs a bit
more work and testing.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Wed, 11 Apr 2018 23:36:22 +0000 (01:36 +0200)]
spirv: Add support for runtime descriptor array cap.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Wed, 11 Apr 2018 23:34:29 +0000 (01:34 +0200)]
spirv: Add support for VK_EXT_descriptor_indexing uniform indexing caps.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Mon, 9 Apr 2018 23:06:47 +0000 (01:06 +0200)]
radv: Support allocating variable size descriptor sets.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Mon, 9 Apr 2018 22:00:22 +0000 (00:00 +0200)]
radv: Add support for variable descriptor set layouts.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Mon, 9 Apr 2018 21:36:19 +0000 (23:36 +0200)]
radv: Fix GetDescriptorSetLayoutSupport.
The continue means we do alignment differently than during creation,
making the buffer smaller than expected.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Mon, 9 Apr 2018 21:16:55 +0000 (23:16 +0200)]
radv: Use sorted bindings for set layout creation.
Previously we did not care about having the set storage in order,
but for variable descriptor count we want the highest binding
at the end of the storage.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
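To illustrate why sorting matters for the commit above, here is a minimal Python sketch, not the driver's code. The `Binding` record, its field names and the byte sizes are assumptions made for the example. Laying out storage in binding order places the highest binding last, so a variable descriptor count can grow the tail without moving anything else.

```python
from dataclasses import dataclass


@dataclass
class Binding:
    binding: int           # binding number from the layout
    descriptor_count: int  # number of descriptors in this binding
    descriptor_size: int   # bytes per descriptor (hypothetical value)


def layout_offsets(bindings):
    """Assign storage offsets in ascending binding order.

    Returns (offsets_by_binding, total_size). Because bindings are sorted
    first, the highest binding number always ends up last in the storage.
    """
    offsets = {}
    cursor = 0
    for b in sorted(bindings, key=lambda b: b.binding):
        offsets[b.binding] = cursor
        cursor += b.descriptor_count * b.descriptor_size
    return offsets, cursor


# Bindings supplied out of order, as an application is allowed to do.
offsets, total = layout_offsets([
    Binding(2, 4, 16),
    Binding(0, 1, 32),
    Binding(1, 2, 16),
])
assert offsets[2] == max(offsets.values())
```

With this ordering, only the total size depends on the count chosen for the last binding.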
Bas Nieuwenhuizen [Mon, 9 Apr 2018 11:02:14 +0000 (13:02 +0200)]
radv: Don't store buffer references in the descriptor set.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Mon, 9 Apr 2018 10:46:49 +0000 (12:46 +0200)]
radv: Keep a global BO list for VkMemory.
With update after bind we can't attach bo's to the command buffer
from the descriptor set anymore, so we have to have a global BO
list.
I am somewhat surprised this works really well even though we have
implicit synchronization in the WSI based on the bo list associations
and with the new behavior every command buffer is associated with
every swapchain image. But I could not find slowdowns in games because
of it.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Bas Nieuwenhuizen [Sun, 8 Apr 2018 11:03:06 +0000 (13:03 +0200)]
spirv: Update spirv.h to 12f8de9f04327336b699b1b80aa390ae7f9ddbf4
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Kenneth Graunke [Fri, 13 Apr 2018 18:48:06 +0000 (11:48 -0700)]
i965: Fix shadow batches to be the same size as the real BO.
brw_bo_alloc may round up our allocation size to the next bucket size.
In this case, we would malloc a shadow buffer that was the original
intended size, but use bo->size (the larger size) for all of our checks.
This could cause us to run off the end of the shadow buffer.
v2: Actually use the new BO size (caught by Lionel)
Reported-by: James Xiong <james.xiong@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c7dcee58b5fe183e1653c13bff6a212f0d157b29 (i965: Avoid problems from referencing orphaned BOs after growing.)
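The size mismatch described in the commit above can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the bucket sizes and the function names `bucket_round_up` and `alloc_batch` are hypothetical, and a `bytearray` stands in for the malloc'ed shadow buffer.

```python
# Hypothetical bucket sizes in bytes; the real allocator's buckets are not
# given in the commit message.
BUCKET_SIZES = [4096, 8192, 16384, 32768]


def bucket_round_up(size):
    """Return the smallest bucket that can hold `size` bytes."""
    for bucket in BUCKET_SIZES:
        if size <= bucket:
            return bucket
    raise ValueError("size exceeds the largest bucket")


def alloc_batch(requested_size):
    """Allocate a (simulated) BO and a shadow copy of the same size."""
    bo_size = bucket_round_up(requested_size)
    # Size the shadow from the BO's actual size, not from requested_size,
    # so bounds checks against bo_size never run past the shadow.
    shadow = bytearray(bo_size)
    return bo_size, shadow


bo_size, shadow = alloc_batch(5000)
assert len(shadow) == bo_size == 8192
```

Had the shadow been sized from the requested 5000 bytes while checks used the rounded 8192, writes between those two offsets would land outside the shadow buffer.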
Marek Olšák [Fri, 13 Apr 2018 19:18:26 +0000 (15:18 -0400)]
glsl_to_tgsi: try harder to lower unsupported ir_binop_vector_extract
This fixes some piglits.
Cc: 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Leo Liu [Wed, 22 Nov 2017 18:31:53 +0000 (13:31 -0500)]
radeon/vce: disable vce dual pipe on VegaM
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 27 Feb 2017 22:28:07 +0000 (23:28 +0100)]
radeonsi: add support for VegaM
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 7 Nov 2017 01:00:03 +0000 (02:00 +0100)]
amd/addrlib: add support for VegaM
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 17 Apr 2018 19:28:04 +0000 (15:28 -0400)]
radeonsi/gfx9: fix a hang with an empty first IB
This packet causes the no-op IB detection to fail, so the IB is always
submitted. Also fix the no-op IB detection by moving the begin call.
Cc: 18.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Dylan Baker [Thu, 5 Apr 2018 21:39:13 +0000 (14:39 -0700)]
meson: build graw tests
This only enables the null and xlib targets, so no windows support yet.
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Dylan Baker [Tue, 6 Feb 2018 23:46:25 +0000 (15:46 -0800)]
meson: build tests for gallium mesa state tracker
v2: - Fix typo
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Dylan Baker [Thu, 11 Jan 2018 00:13:52 +0000 (16:13 -0800)]
meson: build gallium unit tests
v2: - gate unit tests on swrast being enabled (Eric A)
v3: - rebase on libtrace being merged with gallium auxiliary
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net> (v2)
Dylan Baker [Thu, 11 Jan 2018 00:07:11 +0000 (16:07 -0800)]
meson: Build gallium trivial tests
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Dylan Baker [Wed, 10 Jan 2018 23:18:54 +0000 (15:18 -0800)]
meson: Remove TODO about mesa/main tests
They're already done.
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Dylan Baker [Thu, 11 Jan 2018 22:41:42 +0000 (14:41 -0800)]
meson: enable glcpp test
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Dylan Baker [Tue, 9 Jan 2018 23:26:39 +0000 (15:26 -0800)]
glcpp/tests: Convert shell scripts to a python script
This ports glcpp-test.sh and glcpp-test-cr-lf.sh to a python script that
accepts arguments for each line ending type. This should allow for
better reporting to users.
v2: - Use $PYTHON2 to be consistent with other tests in mesa
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
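As a rough illustration of "arguments for each line ending type" in the commit above, the following Python sketch builds one command-line flag per line ending. The option names, the `LINE_ENDINGS` table and the helper functions are assumptions for the example; they are not taken from the actual script.

```python
import argparse

# Hypothetical option names mapped to the line ending each one selects.
LINE_ENDINGS = {
    "unix": "\n",
    "windows": "\r\n",
    "oldmac": "\r",
}


def convert_line_endings(text, ending):
    """Rewrite an LF-terminated test input to use `ending` instead."""
    return text.replace("\n", ending)


def parse_args(argv):
    """Build one boolean flag per supported line ending and parse argv."""
    parser = argparse.ArgumentParser(description="line-ending test sketch")
    for name in LINE_ENDINGS:
        parser.add_argument("--" + name, action="store_true",
                            help="run the tests with %s line endings" % name)
    return parser.parse_args(argv)


def selected_endings(args):
    """Return the line endings requested on the command line, in table order."""
    return [LINE_ENDINGS[n] for n in LINE_ENDINGS if getattr(args, n)]


args = parse_args(["--windows"])
assert selected_endings(args) == ["\r\n"]
```

A single script with per-ending flags can report which ending failed, which is harder to do with one shell script per ending.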
Dylan Baker [Thu, 11 Jan 2018 22:32:53 +0000 (14:32 -0800)]
glsl/tests: Remove unused compare_ir.py script
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Dylan Baker [Thu, 11 Jan 2018 22:32:40 +0000 (14:32 -0800)]
meson: enable optimization-test
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Dylan Baker [Sat, 9 Dec 2017 01:45:03 +0000 (17:45 -0800)]
glsl/tests: Convert optimization-test.sh to pure python
This patch converts optimization-test.sh to python; in the process it
removes external shell dependencies including diff. It replaces the
python script that generates shell scripts with a python library that
generates test cases and runs them using subprocess.
v2: - use $PYTHON2 to be consistent with other tests in mesa
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
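The approach in the commit above (run a tool with subprocess, compare in Python instead of calling diff) might look roughly like the sketch below. It is written for Python 3 rather than the $PYTHON2 the commit mentions, the use of difflib is an assumption, and `run_case` and the `UPPERCASE` stand-in command are hypothetical names invented for the example.

```python
import difflib
import subprocess
import sys


def run_case(command, stdin_text, expected):
    """Run `command`, feed it `stdin_text`, and compare stdout to `expected`.

    Returns (passed, diff_text); diff_text is empty when the case passes.
    """
    proc = subprocess.run(command, input=stdin_text, stdout=subprocess.PIPE,
                          universal_newlines=True)
    actual = proc.stdout
    if actual == expected:
        return True, ""
    diff = difflib.unified_diff(expected.splitlines(True),
                                actual.splitlines(True),
                                fromfile="expected", tofile="actual")
    return False, "".join(diff)


# Stand-in for the real tool under test: a child Python that upper-cases stdin.
UPPERCASE = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]

ok, diff = run_case(UPPERCASE, "abc\n", "ABC\n")
assert ok and diff == ""
```

Because the comparison happens in-process, the only external requirement is the Python interpreter itself.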
Dylan Baker [Sat, 9 Dec 2017 01:45:03 +0000 (17:45 -0800)]
meson: run glsl compiler warnings test
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Dylan Baker [Sat, 9 Dec 2017 01:25:50 +0000 (17:25 -0800)]
glsl/tests: reimplement warnings-test in python
This reimplements the test in python with a shell script wrapper that
allows autotools to continue to run the test without realizing that
anything has changed.
Using python has two advantages. First, it's portable, so this test can be
run on windows as well as Linux since it just requires python, no more
diff, pwd or sh. Second, it's no longer tied to autotools implementation
details, like the environment variables $srcdir and $abs_builddir
(though the autotools shell wrapper still uses those), which makes it
possible to run the test in meson.
v2: - Use $PYTHON2 in script to be consistent with other scripts in mesa
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
George Kyriazis [Tue, 10 Apr 2018 22:49:19 +0000 (17:49 -0500)]
swr/rast: Fix VGATHERPD lowering
Also implement VHSUBPS in x86 lowering pass.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Tue, 10 Apr 2018 17:03:41 +0000 (12:03 -0500)]
swr/rast: Replace x86 VMOVMSK with llvm-only implementation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Tue, 10 Apr 2018 06:05:19 +0000 (01:05 -0500)]
swr/rast: Optimize late/bindless JIT of samplers
Add per-worker thread private data to all shader calls
Add per-worker sampler cache and jit context
Add late LoadTexel JIT support
Add per-worker-thread Sampler / LoadTexel JIT
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Mon, 9 Apr 2018 22:21:46 +0000 (17:21 -0500)]
swr/rast: Implement VROUND intrinsic in x86 lowering pass
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Mon, 9 Apr 2018 18:35:43 +0000 (13:35 -0500)]
swr/rast: Refactor to improve code sharing.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Mon, 9 Apr 2018 17:51:14 +0000 (12:51 -0500)]
swr/rast: minimize codegen redundant work
Move filtering of redundant codegen operations into gen scripts themselves
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Mon, 9 Apr 2018 16:47:37 +0000 (11:47 -0500)]
swr/rast: double-pump in x86 lowering pass
Add support for double-pumping a smaller SIMD width intrinsic.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
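"Double-pumping" in the commit above means executing a wide operation as two passes of a narrower one. The Python sketch below models only that idea with plain lists; `double_pump` and `add8` are hypothetical names, and the real pass operates on LLVM IR rather than Python values.

```python
def add8(x, y):
    """Stand-in for an 8-wide SIMD intrinsic: element-wise add of 8 lanes."""
    assert len(x) == len(y) == 8
    return [p + q for p, q in zip(x, y)]


def double_pump(op8, a, b):
    """Run a 16-wide binary operation as two 8-wide halves.

    The low and high halves are processed separately with the narrower
    operation, then concatenated back into a 16-lane result.
    """
    assert len(a) == len(b) == 16
    lo = op8(a[:8], b[:8])
    hi = op8(a[8:], b[8:])
    return lo + hi


result = double_pump(add8, list(range(16)), [1] * 16)
assert result == list(range(1, 17))
```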
George Kyriazis [Fri, 6 Apr 2018 21:39:09 +0000 (16:39 -0500)]
swr/rast: Fix 64bit float loads in x86 lowering pass
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Fri, 6 Apr 2018 20:48:00 +0000 (15:48 -0500)]
swr/rast: Add shader stats infrastructure (WIP)
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Fri, 6 Apr 2018 20:03:09 +0000 (15:03 -0500)]
swr/rast: Type-check TemplateArgUnroller
Allows direct use of enum values in conversion to template args.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Fri, 6 Apr 2018 18:19:01 +0000 (13:19 -0500)]
swr/rast: Add vgather to x86 lowering pass.
Add support for generic VGATHERPD intrinsic in x86 lowering pass.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Fri, 6 Apr 2018 19:16:12 +0000 (14:16 -0500)]
swr/rast: fix comment
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Thu, 5 Apr 2018 22:51:02 +0000 (17:51 -0500)]
swr/rast: add cvt instructions in x86 lowering pass
Support generic VCVTPD2PS and VCVTPH2PS in x86 lowering pass.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Thu, 5 Apr 2018 20:59:54 +0000 (15:59 -0500)]
swr/rast: Fix alloca usage in jitter
Fix issue where temporary allocas were getting hoisted to function entry
unnecessarily. We now explicitly mark temporary allocas and skip hoisting
during the hoist pass. Should reduce stack usage.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Thu, 5 Apr 2018 17:08:15 +0000 (12:08 -0500)]
swr/rast: Change gfx pointers to gfxptr_t
Changing type to gfxptr for indices and related changes to fetch and mem
builder code.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Tue, 10 Apr 2018 00:47:51 +0000 (19:47 -0500)]
swr/rast: Fix byte offset for non-indexed draws
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Wed, 4 Apr 2018 22:34:54 +0000 (17:34 -0500)]
swr/rast: Add support for setting optimization level
for JIT compilation
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Thu, 29 Mar 2018 19:43:06 +0000 (14:43 -0500)]
swr/rast: Adding translate call to builder_gfx_mem.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Wed, 28 Mar 2018 19:43:09 +0000 (14:43 -0500)]
swr/rast: Fix codegen for typedef types
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Wed, 28 Mar 2018 19:31:20 +0000 (14:31 -0500)]
swr: add x86 lowering pass to fragment shader
Needed because some FP paths (namely stipple) use gather intrinsics
that now need to be lowered to x86.
v2: fix typo in commit message
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Fri, 23 Mar 2018 20:14:58 +0000 (15:14 -0500)]
swr/rast: Enable generalized fetch jit
Enable generalized fetch jit with 8 or 16 wide SIMD target. Still some
work needed to remove some simd8 double pumping for 16-wide target.
Also removed unused non-gather load vertices path.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Mon, 26 Mar 2018 18:29:04 +0000 (13:29 -0500)]
swr/rast: Add builder_gfx_mem.{h|cpp}
Abstract usage scenarios for memory accesses into builder_gfx_mem.
Builder_gfx_mem will convert gfxptr_t from 64-bit int to regular pointer
types for use by builder_mem.
v2: reworded commit message; renamed enum more appropriately
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Thu, 22 Mar 2018 20:25:36 +0000 (15:25 -0500)]
swr/rast: Lower VGATHERPS and VGATHERPS_16 to x86.
Some more work to do before we can support simultaneous 8-wide and
16-wide and remove the VGATHERPS_16 version.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Wed, 21 Mar 2018 18:23:23 +0000 (13:23 -0500)]
swr/rast: Cleanup of JitManager convenience types
Small cleanup. Remove convenience types from JitManager and standardize
on the Builder's convenience types.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Tue, 20 Mar 2018 23:13:35 +0000 (18:13 -0500)]
swr/rast: Lower PERMD and PERMPS to x86.
Add support for providing an emulation callback function for arch/width
combinations that don't map cleanly to an x86 intrinsic.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Tue, 20 Mar 2018 00:05:38 +0000 (19:05 -0500)]
swr/rast: Start refactoring of builder/packetizer.
Move x86 intrinsic lowering to a separate pass. Builder now instantiates
generic intrinsics for features not supported by llvm. The separate x86
lowering pass is responsible for lowering to valid x86 for the target
SIMD architecture. Currently it's a port of existing code to get it
up and running quickly. Will eventually support optimized x86 for AVX,
AVX2 and AVX512.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Mon, 19 Mar 2018 22:46:13 +0000 (17:46 -0500)]
swr/rast: Simplify #define usage in gen source file
Removed preprocessor defines from structures passed to LLVM jitted code.
The python scripts do not understand the preprocessor defines and ignore
them. So for fields that are compiled out due to a preprocessor define
the LLVM script accounts for them anyway because it doesn't know what
the defines are set to. The sanitize defines for open source are fine
in that they're safely used.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Fri, 16 Mar 2018 15:26:25 +0000 (10:26 -0500)]
swr/rast: Move CallPrint() to a separate file
Needed work for jit code debug.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Thu, 15 Mar 2018 22:49:54 +0000 (17:49 -0500)]
swr/rast: Fix name mangling for LLVM pow intrinsic
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Thu, 15 Mar 2018 20:58:10 +0000 (15:58 -0500)]
swr/rast: Add some archrast counters
Hook up archrast counters for shader stats: instructions executed.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Thu, 15 Mar 2018 18:43:08 +0000 (13:43 -0500)]
swr/rast: Code cleanup
Removing some code that doesn't seem to do anything meaningful.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Thu, 15 Mar 2018 17:49:51 +0000 (12:49 -0500)]
swr/rast: Add "Num Instructions Executed" stats intrinsic.
Added a SWR_SHADER_STATS structure which is passed to each shader. The
stats pass will instrument the shader to populate this.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Thu, 15 Mar 2018 17:08:00 +0000 (12:08 -0500)]
swr/rast: Add MEM_ADD helper function to Builder.
mem[offset] += value
This function will be heavily used by all stats intrinsics.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
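The semantics stated in the commit above, mem[offset] += value, can be modelled directly. This Python sketch shows only the read-modify-write behaviour on a list; the real helper emits JIT code, and the counter names and the `stats` list here are hypothetical.

```python
def mem_add(mem, offset, value):
    """Add `value` to the slot at `offset`, i.e. mem[offset] += value."""
    mem[offset] += value


# Hypothetical stats block with two counters.
NUM_INSTRUCTIONS = 0
NUM_BRANCHES = 1
stats = [0, 0]

mem_add(stats, NUM_INSTRUCTIONS, 12)
mem_add(stats, NUM_INSTRUCTIONS, 3)
assert stats == [15, 0]
```

Each stats intrinsic then reduces to one such call with its own offset into the stats structure.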
George Kyriazis [Wed, 14 Mar 2018 18:38:18 +0000 (13:38 -0500)]
swr/rast: Permute work for simd16
Fix slow permutes in PA tri lists under SIMD16 emulation on AVX
Added missing permute (interlane, immediate) to SIMDLIB
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Wed, 14 Mar 2018 17:29:04 +0000 (12:29 -0500)]
swr/rast: WIP builder rewrite (2)
Finish up the remaining explicit intrinsic uses. At this point all
explicit Intrinsic::getDeclaration() usage has been replaced with auto
generated macros generated with gen_llvm_ir_macros.py. Going forward,
make sure to only use the intrinsics here, adding new ones as needed.
Next step is to remove all references to x86 intrinsics to keep the
builder target-independent. Any x86 lowering will be handled by a
separate pass.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Tue, 13 Mar 2018 18:46:41 +0000 (13:46 -0500)]
swr/rast: Add autogen of helper llvm intrinsics.
Replace sqrt, maskload, fp min/max, cttz, ctlz with llvm equivalent.
Replace AVX maskedstore intrinsic with LLVM intrinsic. Add helper llvm
macros for stacksave, stackrestore, popcnt.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Mon, 12 Mar 2018 18:18:56 +0000 (13:18 -0500)]
swr/rast: WIP builder rewrite.
Start removing avx2 macros for functionality that exists in llvm.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Tue, 13 Mar 2018 01:34:19 +0000 (20:34 -0500)]
swr/rast: LLVM 6 fix
for getting masked gather intrinsic (also compatible with LLVM 4)
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Sat, 10 Mar 2018 06:04:11 +0000 (00:04 -0600)]
swr/rast: Changes to allow jitter to compile with LLVM5
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Wed, 7 Mar 2018 01:32:53 +0000 (19:32 -0600)]
swr/rast: Add some archrast stats
Add stats for degenerate and backfacing primitive counts
Wire archrast stats for alpha blend and alpha test.
pass value to jitter, upon return have archrast event increment a value
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Fri, 9 Mar 2018 17:37:57 +0000 (11:37 -0600)]
swr/rast: Silence some unused variable warnings
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Thu, 8 Mar 2018 22:19:36 +0000 (16:19 -0600)]
swr/rast: Add debug type info for i128
Help support debug info in 16 wide shaders.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Thu, 8 Mar 2018 07:35:17 +0000 (01:35 -0600)]
swr/rast: Use blend context struct to pass params
Stuff parameters into a blend context struct before passing down through
the PFN_BLEND_JIT_FUNC function pointer. Needed for stat changes.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Wed, 7 Mar 2018 19:33:44 +0000 (13:33 -0600)]
swr/rast: Introduce JIT_MEM_CLIENT
Add assert for correct usage of memory accesses
v2: reworded commit message; renamed enum more appropriately
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
George Kyriazis [Wed, 7 Mar 2018 18:00:52 +0000 (12:00 -0600)]
swr/rast: Add some instructions to jitter
VPHADDD, PMAXUD, PMINUD
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Juan A. Suarez Romero [Wed, 18 Apr 2018 15:29:12 +0000 (15:29 +0000)]
docs: update calendar, add news and link release notes to 18.0.1
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Juan A. Suarez Romero [Wed, 18 Apr 2018 15:25:00 +0000 (15:25 +0000)]
docs: add sha256 checksums for 18.0.1
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit a1c421c638fd9ff2810b2a59f1ccd0a3a03657b1)
Juan A. Suarez Romero [Wed, 18 Apr 2018 14:44:49 +0000 (14:44 +0000)]
docs: add release notes for 18.0.1
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 8bd719e3faee8cb0054f51cf1fe9d372a9eea0ea)
Juan A. Suarez Romero [Wed, 18 Apr 2018 09:45:04 +0000 (09:45 +0000)]
docs: update calendar, add news and link release notes to 17.3.9
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Juan A. Suarez Romero [Wed, 18 Apr 2018 09:39:48 +0000 (09:39 +0000)]
docs: add sha256 checksums for 17.3.9
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit cf0864dc63caf1285bdede364e9a39b22bac5938)
Juan A. Suarez Romero [Wed, 18 Apr 2018 08:40:26 +0000 (08:40 +0000)]
docs: add release notes for 17.3.9
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
(cherry picked from commit 6d88ea9dd46e630ee861e773dfe4a49f5d1c1fbd)
Dylan Baker [Tue, 17 Apr 2018 20:47:17 +0000 (13:47 -0700)]
Revert "meson: add wrap for libdrm"
This reverts commit 6217eedc9bac86856d5048c43b5f5a3f6976c13e.
I was using this for testing and accidentally put it on master.
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Dylan Baker [Tue, 17 Apr 2018 20:47:06 +0000 (13:47 -0700)]
Revert "Add subprojects directory and git ignore"
This reverts commit 21e2e73f71096fd4607051c060cf82c593663d50.
I was using this for testing and accidentally put it on master.
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Jan Alexander Steffens (heftig) [Sat, 14 Apr 2018 17:23:22 +0000 (19:23 +0200)]
meson: Version libMesaOpenCL like autotools does
This is for parity with autotools. It names the library
libMesaOpenCL.so.1.0.0 and points mesa.icd to the .1 symlink.
opencl_version now matches configure.ac's OPENCL_VERSION.
Signed-off-by: Jan Alexander Steffens (heftig) <jan.steffens@gmail.com>
Tested-By: Aaron Watry <awatry@gmail.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Jan Alexander Steffens (heftig) [Sat, 14 Apr 2018 17:23:21 +0000 (19:23 +0200)]
meson: Add library versions to swr drivers
This is for parity with autotools.
Signed-off-by: Jan Alexander Steffens (heftig) <jan.steffens@gmail.com>
Acked-by: Dylan Baker <dylan@pnwbakers.com>
Dylan Baker [Fri, 13 Apr 2018 19:18:10 +0000 (12:18 -0700)]
meson: add wrap for libdrm
Currently this requires libdrm from git, since the version reported by
meson is wrong.
Dylan Baker [Fri, 13 Apr 2018 19:04:57 +0000 (12:04 -0700)]
Add subprojects directory and git ignore
For meson wraps.
Samuel Pitoiset [Tue, 17 Apr 2018 20:07:26 +0000 (22:07 +0200)]
radv: fix scissor computation when using half-pixel viewport offset
'scale[i]' can be non-integer.
Original patch by Philip Rebohle.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106074
Fixes: 0f3de89a56a ("radv: Use the guard band.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
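To see why a non-integer scale matters for the scissor in the commit above, consider one axis of a viewport offset by half a pixel. The Python sketch below is an illustration only and is not the driver's code: the function names are hypothetical, the viewport transform is simplified to one dimension, and the truncating variant is an assumed naive approach shown for contrast.

```python
import math


def viewport_xform(x, width):
    """One-axis viewport transform: returns (scale, translate)."""
    scale = width / 2.0
    translate = x + scale
    return scale, translate


def scissor_1d(x, width):
    """Conservative integer scissor: floor the minimum, ceil the maximum."""
    scale, translate = viewport_xform(x, width)
    lo = math.floor(translate - abs(scale))
    hi = math.ceil(translate + abs(scale))
    return lo, hi - lo


def scissor_1d_truncating(x, width):
    """Assumed naive variant: truncate the offset and the extent separately."""
    scale, translate = viewport_xform(x, width)
    return int(translate - abs(scale)), int(2 * abs(scale))


# Viewport covering [0.5, 100.5): it touches pixels 0 through 100.
assert scissor_1d(0.5, 100.0) == (0, 101)
# Truncation yields pixels 0 through 99 and drops the partially covered one.
assert scissor_1d_truncating(0.5, 100.0) == (0, 100)
```

Rounding the two edges outward independently keeps every partially covered pixel inside the scissor.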
Neil Roberts [Wed, 21 Mar 2018 19:34:40 +0000 (20:34 +0100)]
spirv: Accept doubles in FaceForward, Reflect and Refract
The SPIR-V spec doesn’t specify a size requirement for these and the
equivalent functions in the GLSL spec have explicit alternatives for
doubles. Refract is a little bit more complicated due to the fact that
the final argument is always supposed to be a scalar 32- or 16-bit
float regardless of the other operands. However in practice it seems
there is a bug in glslang that makes it convert the argument to 64-bit
if you actually try to pass it a 32-bit value while the other
arguments are 64-bit. This adds an optional conversion of the final
argument in order to support any type.
These have been tested against the automatically generated tests of
glsl-4.00/execution/built-in-functions using the ARB_gl_spirv branch
which tests it with quite a large range of combinations.
The issue with glslang has been filed here:
https://github.com/KhronosGroup/glslang/issues/1279
v2: Convert the eta operand of Refract from any size in order to make
it eventually cope with 16-bit floats.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Neil Roberts [Wed, 21 Mar 2018 19:34:39 +0000 (20:34 +0100)]
spirv: Add a 64-bit implementation of OpIsInf
The only change necessary is to change the type of the constant used
to compare against.
This has been tested against the arb_gpu_shader_fp64/execution/
fs-isinf-dvec tests using the ARB_gl_spirv branch.
v2: Use nir_imm_floatN_t for the constant.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Neil Roberts [Wed, 21 Mar 2018 19:34:38 +0000 (20:34 +0100)]
spirv: Use nir_imm_floatN_t for constants for GLSL450 builtins
There is an existing macro that is used to choose between either a
float or a double immediate constant based on the bit size of the
first operand to the builtin. This is now changed to use the new
nir_imm_floatN_t helper function to reduce the number of places that
make this decision.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Neil Roberts [Wed, 21 Mar 2018 19:34:37 +0000 (20:34 +0100)]
nir/builder: Add a nir_imm_floatN_t helper
This lets you easily build float immediates just given the bit size.
If we have this single place here to handle this then it will be
easier to add support for 16-bit floats later.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
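The idea behind the helper in the commit above, one place that turns a value plus a bit size into a float immediate, can be sketched in Python with the struct module. This is not the NIR API: `imm_float_n` and `_FORMATS` are hypothetical names, the result is just the little-endian byte encoding, and the 16-bit entry is an assumption showing how the later extension mentioned in the commit message could slot in.

```python
import struct

# Bit size -> struct format for little-endian IEEE 754 encodings.
# The 16-bit entry is an assumption illustrating the "later" extension.
_FORMATS = {16: "<e", 32: "<f", 64: "<d"}


def imm_float_n(value, bit_size):
    """Encode `value` as a float immediate of the requested bit size."""
    fmt = _FORMATS.get(bit_size)
    if fmt is None:
        raise ValueError("unsupported float bit size: %r" % (bit_size,))
    return struct.pack(fmt, value)


# Callers pass the operand's bit size instead of choosing float vs double.
assert len(imm_float_n(1.0, 32)) == 4
assert len(imm_float_n(1.0, 64)) == 8
```

Keeping the size dispatch in a single table means a new width needs one added entry rather than a change at every call site.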