Jason Ekstrand [Thu, 26 May 2016 00:27:23 +0000 (17:27 -0700)]
i965: Move brw_create_nir to brw_program.c
This way it's no longer part of libi965_compiler.la since it depends on
GLSL and ARB program stuff.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Jason Ekstrand [Thu, 26 May 2016 00:26:42 +0000 (17:26 -0700)]
i965/nir: Move the type_size_*_bytes functions to brw_nir.h
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Jason Ekstrand [Thu, 26 May 2016 00:27:57 +0000 (17:27 -0700)]
ptn: Include nir.h
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Jason Ekstrand [Wed, 25 May 2016 23:00:38 +0000 (16:00 -0700)]
compiler: Move glsl_to_nir to libglsl.la
Right now libglsl.la depends on libnir.la so putting it in libnir.la
adds a dependency on libglsl.la that goes the wrong direction.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Ben Widawsky [Thu, 26 May 2016 18:04:07 +0000 (11:04 -0700)]
i965/sklgt4: Implement depth/timestamp write w/a
The stated bug describes a scenario in which a post sync write operation for
depth or timestamp can be ignored. There are two workarounds suggested, the
first and easier is to simply do a CS stall when we do these types of writes.
The second option is to do a PIPE_CONTROL flush after the post sync but before
the data is required.
Generally, I believe the data written out is consumed by the application on the
CPU side and so doing the easier of the two is ideal. Furthermore, these queries
aren't tremendously common in the perf sensitive apps I have looked at. However,
there could be cases where a shader stage might directly consume the data, and
as a result option 2 may be desirable.
This patch goes with the easier solution for now.
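As a rough illustration of the easier option, the sketch below adds a CS stall whenever a post-sync depth or timestamp write is requested. The flag names and bit values are hypothetical stand-ins invented for this example, not the driver's real PIPE_CONTROL definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch; not the driver's real values. */
#define SKETCH_WRITE_DEPTH_COUNT (1u << 0)
#define SKETCH_WRITE_TIMESTAMP   (1u << 1)
#define SKETCH_CS_STALL          (1u << 2)

/* Returns the flags to emit. When the workaround applies and a post-sync
 * depth-count or timestamp write is requested, a CS stall is added. */
static uint32_t
sketch_apply_post_sync_wa(uint32_t flags, int needs_wa)
{
   const uint32_t post_sync =
      SKETCH_WRITE_DEPTH_COUNT | SKETCH_WRITE_TIMESTAMP;

   /* Only flags known to this sketch are expected. */
   assert((flags & ~(post_sync | SKETCH_CS_STALL)) == 0);

   if (needs_wa && (flags & post_sync))
      flags |= SKETCH_CS_STALL;

   return flags;
}
```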
gen9lp bug_de_id=2137196
By itself, this does *not* fix any of the GT4 hangs we're currently
experiencing.
Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Ben Widawsky [Thu, 26 May 2016 15:08:29 +0000 (08:08 -0700)]
i965/bxt: Add 2x6 variant
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Bas Nieuwenhuizen [Tue, 12 Apr 2016 18:28:46 +0000 (20:28 +0200)]
radeonsi: Allow TES distribution between shader engines.
The R_028B50_VGT_TESS_DISTRIBUTION value is copied from
amdgpu-pro. Smaller values in the ACCUM fields seem to
decrease the performance advantage from this patch, higher
values don't seem to matter.
v2: Add distribution mode field enums.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bas Nieuwenhuizen [Mon, 2 May 2016 13:00:21 +0000 (15:00 +0200)]
radeonsi: Process multiple patches per threadgroup.
Using more than 1 wave per threadgroup does increase performance
generally. Not using too many patches per threadgroup also
increases performance. Both catalyst and amdgpu-pro seem to
use 40 patches as their maximum, but I haven't really seen
any performance increase from limiting the number of patches
to 40 instead of 64.
Note that the trick where we overlap the input and output LDS
does not work anymore as the insertion of the tess factors
changes the patch stride.
v2: - Add comment about LDS assumptions.
- Add constant for buffer size.
- Fix code style.
v3: - Correct limits for not splitting patches between waves.
- Set max num_patches to 40 as in the proprietary driver.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bas Nieuwenhuizen [Thu, 26 May 2016 12:09:43 +0000 (14:09 +0200)]
radeonsi: Add barrier before writing the tess factors.
The factors may be stored to LDS by another invocation than
the invocation for vertex 0.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bas Nieuwenhuizen [Mon, 2 May 2016 12:59:43 +0000 (14:59 +0200)]
radeonsi: Enable dynamic HS.
This allows running the TES on different CU's than the
TCS which results in performance improvements.
v2: Only write the control word from one invocation.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bas Nieuwenhuizen [Mon, 9 May 2016 23:05:32 +0000 (01:05 +0200)]
radeonsi: Remove LDS layout user SGPR's from TES.
They are unused.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bas Nieuwenhuizen [Mon, 2 May 2016 12:55:52 +0000 (14:55 +0200)]
radeonsi: Use buffer loads and stores for passing data from TCS to TES.
We always try to use 4-component loads, as LLVM does not combine loads
and they bypass the L1 cache.
We can't use a similar strategy for stores and this is especially
notable with the tess factors, as they are often set with separate
MOV's per component in the TGSI.
We keep storing to LDS and keep the LDS space, so we can load the outputs
later, either due to the shader, or for writing the tess factors.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bas Nieuwenhuizen [Tue, 3 May 2016 19:31:00 +0000 (21:31 +0200)]
radeonsi: Store inputs to memory when not using a TCS.
We need to copy the VS outputs to memory. I decided to do this
using a shader key, as the value depends on other shaders.
I also switch the fixed function TCS over to monolithic, as
otherwise many of the user SGPR's need to be passed to the
epilog, which increases register pressure, or complexity to
avoid that. The main body of the fixed function TCS is not
that interesting to precompile anyway, since we do it on
demand and it is very small.
v2: Use u_bit_scan64.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bas Nieuwenhuizen [Mon, 9 May 2016 22:49:39 +0000 (00:49 +0200)]
radeonsi: Add offchip buffer address calculation.
Instead of creating a memory area per patch and per vertex, we put
the same attribute of every vertex & patch together. Most loads
and stores access the same attribute across all lanes, only for
different patches and vertices.
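A minimal sketch of such an attribute-major layout, assuming 16-byte (vec4) slots and per-vertex attributes only. The function name and parameters are hypothetical and this is not the driver's actual address calculation; it only shows that the same attribute of neighbouring vertices lands in adjacent slots.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of one vec4 attribute of one vertex when all vertices'
 * copies of the same attribute are stored contiguously. */
static uint32_t
sketch_attr_major_offset(uint32_t attr, uint32_t patch, uint32_t vertex,
                         uint32_t verts_per_patch, uint32_t num_patches)
{
   assert(vertex < verts_per_patch && patch < num_patches);

   uint32_t total_vertices = verts_per_patch * num_patches;
   uint32_t slot = attr * total_vertices + patch * verts_per_patch + vertex;

   /* Scale to bytes last, so everything before this is plain slot math. */
   return slot * 16u;
}
```

With 3 vertices per patch and 4 patches, attribute 0 of consecutive vertices is 16 bytes apart, while attribute 1 starts after all 12 vertices' copies of attribute 0.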
For the TCS this results in tightly packed data for 4-component
stores.
For the TES this is not the case as within a patch the loads
often also access the same vertex. However if there are < 4
vertices/patch, this still results in a reduction of the number
of cache lines. In the LDS situation we only do better than worst
case if the data per patch < 64 bytes, which due to the
tessellation factors is pretty much never.
We do not use hardware swizzling for this. It would slightly reduce
the number of executed VALU instructions, but I had issues with
increased wait times that I haven't been able to solve yet.
Furthermore, the tbuffer_store intrinsic does not support both
VGPR offset and an index, so we have a problem storing
indirectly indexed outputs. This can be solved by temporarily
storing arrays in LDS and then copying them, but I don't think
that is worth the effort. The difference in VALU cycles
hardware swizzling gives is about 0.2% of total busy cycles.
That is without handling the array case.
I chose attributes instead of components as they are often
accessed together, and the software swizzling takes VALU cycles
for calculating offsets.
v2: - Rename functions to get_tcs_tes_buffer_address.
- multiply by 16 as late as possible.
- Use tgsi_full_src_register_from_dst.
- Remove some bad comments.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bas Nieuwenhuizen [Mon, 9 May 2016 22:48:55 +0000 (00:48 +0200)]
radeonsi: Add user SGPR for the layout of the offchip buffer.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bas Nieuwenhuizen [Sun, 1 May 2016 18:35:40 +0000 (20:35 +0200)]
radeonsi: Use correct parameter index for LS_OUT_LAYOUT.
This happens to be in the right position, but that changes
when TCS/TES get new parameters.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bas Nieuwenhuizen [Mon, 2 May 2016 12:39:56 +0000 (14:39 +0200)]
radeonsi: Add buffer load functions.
v2: - Use llvm.amdgcn.buffer.load intrinsics for new LLVM.
- Code style fixes.
v3: - Code style fix.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bas Nieuwenhuizen [Mon, 2 May 2016 12:20:19 +0000 (14:20 +0200)]
radeonsi: Define build_tbuffer_store_dwords earlier to support new users.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bas Nieuwenhuizen [Mon, 2 May 2016 11:20:43 +0000 (13:20 +0200)]
radeonsi: Add offchip tessellation parameters.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bas Nieuwenhuizen [Mon, 2 May 2016 07:54:11 +0000 (09:54 +0200)]
radeonsi: Add buffer for offchip storage between TCS and TES.
The buffer is quite large, but should only be allocated if the
application uses tessellation. Most non-games don't.
v2: - Use the correct register for SI.
- Add define for block size.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Rob Clark [Thu, 26 May 2016 15:11:32 +0000 (11:11 -0400)]
tgsi: fix coverity out-of-bounds warning
CID 1271532 (#1 of 1): Out-of-bounds read (OVERRUN)34. overrun-local:
Overrunning array of 2 16-byte elements at element index 2 (byte offset
32) by dereferencing pointer &inst.Dst[i].
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Rob Clark [Thu, 26 May 2016 14:22:33 +0000 (10:22 -0400)]
tgsi: fix out of bounds access
Not sure why coverity calls this an out-of-bounds read vs out-of-bounds
write.
CID 1358920 (#1 of 1): Out-of-bounds read (OVERRUN)9. overrun-local:
Overrunning array r of 3 16-byte elements at element index 3 (byte
offset 48) using index chan (which evaluates to 3).
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Anuj Phogat [Wed, 25 May 2016 18:33:51 +0000 (11:33 -0700)]
i965: Don't use fast copy blit in case of logical operations other than GL_COPY
XY_FAST_COPY_BLT command doesn't have a field for raster operation. So, fall
back to using XY_SRC_COPY_BLT to handle those cases.
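The fallback decision can be sketched as below. The enum, function name and the `fast_copy_possible` parameter are hypothetical; the two constants mirror the values of GL_COPY and GL_XOR so the example builds without GL headers.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_GL_COPY 0x1503 /* mirrors GL_COPY */
#define SKETCH_GL_XOR  0x1506 /* mirrors GL_XOR */

enum sketch_blit_cmd {
   SKETCH_XY_SRC_COPY_BLT,
   SKETCH_XY_FAST_COPY_BLT,
};

/* The fast copy command has no raster-operation field, so it is only
 * usable when the logic op is a plain copy. */
static enum sketch_blit_cmd
sketch_pick_blit(bool fast_copy_possible, unsigned logic_op)
{
   /* GL logic ops occupy the enum range 0x1500..0x150F. */
   assert(logic_op >= 0x1500 && logic_op <= 0x150F);

   if (fast_copy_possible && logic_op == SKETCH_GL_COPY)
      return SKETCH_XY_FAST_COPY_BLT;

   return SKETCH_XY_SRC_COPY_BLT;
}
```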
Fixes piglit test gl-1.1-xor-copypixels when fast copy blit is enabled
for all tiling formats.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Anuj Phogat [Sat, 12 Dec 2015 03:14:24 +0000 (19:14 -0800)]
i965/gen9: Remove the halign/valign field setup code in fast copy blit
Experimentation with different values of src/dst horizontal/vertical
alignment showed that these fields are not used on gen9 hardware.
A recent update in graphics specs has removed these fields from
XY_FAST_COPY_BLT command.
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Chad Versace <chad.versace@intel.com>
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Samuel Pitoiset [Wed, 25 May 2016 21:36:48 +0000 (23:36 +0200)]
nvc0: allow to monitor MP perf counters with compute shaders
To read out MP perf counters we use a compute shader and need to upload
input data like a 64-bit address used to store the values and a sequence
ID for synchronization. Currently, this input data is uploaded as user
uniforms, which means that it is stuck in c0[], but if a compute shader
from a real application is used, monitoring those performance counters
will just overwrite some data and miserably crash.
Instead, putting the 64-bit address and the sequence ID into the driver
constant buffer seems much better and will allow monitoring
counters with GL 4.3 apps.
Tested on GF119 and GK110, but should not hurt anything on GK104.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Kristian Høgsberg Kristensen [Wed, 25 May 2016 22:29:41 +0000 (15:29 -0700)]
mesa: Move robustness code to main/robustness.c
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Brian Paul <brianp@vmware.com>
Kristian Høgsberg Kristensen [Wed, 25 May 2016 22:22:52 +0000 (15:22 -0700)]
docs: Mark GL_KHR_robustness done for GLES3.2 as well
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Brian Paul <brianp@vmware.com>
Plamena Manolova [Wed, 25 May 2016 16:29:55 +0000 (17:29 +0100)]
egl: Additional attribute validation for eglCreatePbufferSurface
eglCreatePbufferSurface should generate an EGL_BAD_MATCH error if:
1: The EGL_TEXTURE_FORMAT attribute is EGL_NO_TEXTURE and EGL_TEXTURE_TARGET
is something other than EGL_NO_TEXTURE
2: EGL_TEXTURE_FORMAT is something other than EGL_NO_TEXTURE and
EGL_TEXTURE_TARGET is EGL_NO_TEXTURE.
This fixes the dEQP-EGL.functional.negative_api.create_pbuffer_surface test.
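The two conditions amount to requiring that the format and target are either both EGL_NO_TEXTURE or both something else. A minimal sketch of that check follows; the function name is hypothetical and the constants are local stand-ins mirroring the EGL enum values, so the example builds without EGL headers. It is not the actual code added to eglCreatePbufferSurface.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins mirroring the EGL enum values. */
#define SKETCH_EGL_SUCCESS      0x3000
#define SKETCH_EGL_BAD_MATCH    0x3009
#define SKETCH_EGL_NO_TEXTURE   0x305C
#define SKETCH_EGL_TEXTURE_RGBA 0x305E
#define SKETCH_EGL_TEXTURE_2D   0x305F

/* Returns the EGL error a pbuffer attribute pair should produce. */
static int
sketch_check_pbuffer_texture_attribs(int texture_format, int texture_target)
{
   bool no_format = texture_format == SKETCH_EGL_NO_TEXTURE;
   bool no_target = texture_target == SKETCH_EGL_NO_TEXTURE;

   /* Exactly one of the two being EGL_NO_TEXTURE is a mismatch. */
   return no_format != no_target ? SKETCH_EGL_BAD_MATCH : SKETCH_EGL_SUCCESS;
}
```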
Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Marek Olšák [Tue, 24 May 2016 23:00:53 +0000 (01:00 +0200)]
gallium/radeon: add the kernel version into the renderer string
Example:
Gallium 0.4 on AMD TONGA (DRM 3.2.0 / 4.5.0, LLVM 3.9.0)
My kernel version is pretty long already (4.5.0-amd-01025-g32791c1)
and adding "kernel" into the string would make it too long for glxinfo
to display.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Marek Olšák [Tue, 8 Mar 2016 00:19:31 +0000 (01:19 +0100)]
winsys/amdgpu: add back multithreaded command submission
Ported from the initial amdgpu winsys from the private AMD branch.
The thread creates the buffer list, submits IBs, and cleans up
the submission context, which can also destroy buffers.
3-5% reduction in CPU overhead is expected for apps submitting a lot
of IBs per frame. This is most visible with DMA IBs.
v2: use a semaphore instead of a busy loop in amdgpu_ws_queue_cs
add another amdgpu_cs_sync_flush call into amdgpu_bo_map
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Lars Hamre [Thu, 19 May 2016 21:34:00 +0000 (15:34 -0600)]
gallium/tgsi: use _mesa_roundevenf in micro_rnd
Fixes the following piglit tests (for softpipe):
/spec/glsl-1.30/execution/built-in-functions/...
fs-roundeven-float
fs-roundeven-vec2
fs-roundeven-vec3
fs-roundeven-vec4
vs-roundeven-float
vs-roundeven-vec2
vs-roundeven-vec3
vs-roundeven-vec4
/spec/glsl-1.50/execution/built-in-functions/...
gs-roundeven-float
gs-roundeven-vec2
gs-roundeven-vec3
gs-roundeven-vec4
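For reference, the behaviour these roundeven tests expect is rounding to the nearest integer with ties going to the even neighbour. The sketch below is a hand-written illustration of that rule under the name `sketch_round_half_even` (hypothetical); it is not the implementation of _mesa_roundevenf, and unlike a full implementation it does not preserve the sign of zero.

```c
#include <assert.h>

/* Round to nearest integer, ties to even. */
static float
sketch_round_half_even(float x)
{
   /* Floats with magnitude >= 2^23 have no fractional bits; NaN and
    * infinities also take this path and are returned unchanged. */
   if (!(x > -8388608.0f && x < 8388608.0f))
      return x;

   int t = (int)x;                 /* truncates toward zero */
   float diff = x - (float)t;      /* fractional part, in (-1, 1) */
   float adiff = diff < 0.0f ? -diff : diff;

   /* Step away from zero when past the halfway point, or exactly on it
    * with an odd truncated value. */
   if (adiff > 0.5f || (adiff == 0.5f && (t & 1)))
      t += (x < 0.0f) ? -1 : 1;

   return (float)t;
}
```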
Signed-off-by: Lars Hamre <chemecse@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Emil Velikov [Thu, 26 May 2016 12:57:32 +0000 (13:57 +0100)]
.mailmap: use Jakob Bornecrantz's personal email
The VMware one is bouncing.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Ilia Mirkin [Thu, 26 May 2016 04:02:57 +0000 (00:02 -0400)]
nvc0: add note about where the viewport mask would go
Not piping this all the way through yet, but no better place to note
this down. This can be used with NV_viewport_array2.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Sat, 21 May 2016 23:09:32 +0000 (19:09 -0400)]
nvc0: enable 32 textures on kepler+
For fermi, this likely will require use of linked tsc mode. However on
bindless architectures, we can have as many as we want. As it stands,
the AUX_TEX_INFO has 32 texture handles reserved.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Alejandro Piñeiro [Wed, 20 Apr 2016 08:02:45 +0000 (10:02 +0200)]
glsl: add unit tests data vertex/expected outcome for uninitialized warning
v2: fix 025 test. Add three more tests (Ian Romanick)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Alejandro Piñeiro [Tue, 19 Apr 2016 19:03:07 +0000 (21:03 +0200)]
glsl: add warning-test
It executes compiler-glsl on all the available shaders, and it checks
that the outcome is as expected.
Bash code based on the already existing optimization-test
v2: rebasing: use --version option
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Alejandro Piñeiro [Tue, 19 Apr 2016 18:26:32 +0000 (20:26 +0200)]
glsl: add just-log option for the standalone compiler.
Add an option in order to ask to just print the InfoLog, without any
header or separator. Useful if we want to use the standalone compiler
to track only the warning/error messages.
v2: all printfs goes on its own line (Ian Romanick)
v3: rebasing: move just_log to standalone.h/cpp
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Alejandro Piñeiro [Tue, 19 Apr 2016 09:17:27 +0000 (11:17 +0200)]
glsl: do not raise uninitialized warning with out function parameters
It silences by default warnings with function parameters, as the
parameters need to be processed in order to have the actual and the
formal parameter, and the function signature. Then it raises the
warning if needed at verify_parameter_modes where other in/out/inout modes
checks are done.
v2: fix comment style, multi-line condition style, simplify check,
remove extra blank (Ian Romanick)
v3: inout function parameters can raise the warning too (Ian
Romanick)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Alejandro Piñeiro [Tue, 19 Apr 2016 09:15:54 +0000 (11:15 +0200)]
glsl: add an empty set_is_lhs on ast_node
Just to allow calling set_is_lhs on any ast_node without a cast. Useful
when processing an ast_node list that we know contains ast_expression.
v2: comment out new_value to avoid unused parameter warning (Ian Romanick)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Dave Airlie [Wed, 25 May 2016 03:31:41 +0000 (13:31 +1000)]
glsl: handle implicit sized arrays in ssbo
The current code disallows unsized arrays except at the end of
an SSBO but it is a bit overzealous in doing so.
struct a {
int b[];
int f[4];
};
is valid as long as b is implicitly sized within the shader,
i.e. it is accessed only by integer indices.
I've submitted some piglit tests to test for this.
This also has no regressions on piglit on my Haswell.
This fixes:
GL45-CTS.shader_storage_buffer_object.basic-syntax
GL45-CTS.shader_storage_buffer_object.basic-syntaxSSO
This patch moves a chunk of the linker code down, so
that we don't link the uniform blocks until after we've
merged all the variables. The logic went something like:
Removing the checks for last ssbo member unsized from
the compiler and into the linker, meant doing the check
in the link_uniform_blocks code. However to do that the
array sizing had to happen first, so we knew that the
only unsized arrays were in the last block. But array
sizing required the variable to be merged, otherwise
you'd get two different array sizes in different
version of two variables, and one would get lost
when merged. So the solution was to move array sizing
up, after variable merging, but before uniform block
visiting.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 25 May 2016 21:42:16 +0000 (07:42 +1000)]
glsl: fix error message on uniform block mismatch
This looks like a cut-paste from above.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 25 May 2016 23:23:54 +0000 (09:23 +1000)]
glsl/ast: assign explicit_xfb_buffer from correct place
This fixes:
GL44-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.data_pass_through
As the OUT_TC interface structures weren't matching because
one of them had explicit_xfb_buffer set when it shouldn't.
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Bruce Cherniak [Tue, 24 May 2016 20:00:17 +0000 (15:00 -0500)]
swr: [rasterizer] Correctly select optimized primitive assembly.
Indexed primitives were always using cut-aware primitive assembly,
whether primitive_restart was enabled or not. Correctly pass down
primitive_restart and select optimized PA when possible.
Reviewed-by: Tim Rowley <timothy.o.rowley@intel.com>
Kenneth Graunke [Wed, 25 May 2016 21:22:56 +0000 (14:22 -0700)]
docs: Mention i965/gen8+ supports GL 4.2 in release notes.
Kenneth Graunke [Wed, 25 May 2016 21:22:30 +0000 (14:22 -0700)]
docs: Update GL_OES_copy_image status.
Kenneth Graunke [Fri, 20 May 2016 04:44:59 +0000 (21:44 -0700)]
i965: Enable OES_copy_image (and EXT) on Gen8+ and Baytrail.
For now, only enable it on platforms that actually support ETC2.
At this point, Broadwell is only failing 5 (out of 8358) dEQP tests:
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.
srgb8_alpha8_r11f_g11f_b10f.renderbuffer_to_texture3d
srgb8_alpha8_rgb10_a2ui.renderbuffer_to_cubemap
srgb8_alpha8_rgb10_a2ui.renderbuffer_to_renderbuffer
srgb8_alpha8_rgb10_a2.renderbuffer_to_texture2d
srgb8_alpha8_rgb9_e5.renderbuffer_to_texture3d
These fail with all methods (meta, blorp, blitter, memcpy).
All are blacklisted from the Android mustpass list, which makes me
wonder whether there's an issue with the tests. The formats in
question work with other targets, and the targets in question work
with other formats...
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
Kenneth Graunke [Fri, 20 May 2016 04:13:29 +0000 (21:13 -0700)]
i965: Implement a BLORP path for CopyImage and prefer it over Meta.
We're dropping Meta in favor of BLORP everywhere we can.
This also fixes bugs when copying cubemaps to 2D, which is currently
broken in the meta pass. BLORP just works.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94198
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
Kenneth Graunke [Fri, 20 May 2016 04:10:14 +0000 (21:10 -0700)]
i965: Make the CopyImage BLT path bail for stencil images.
The BLT can't handle S8 because it's W-tiled (at least without
additional funny business, and I'm not sure we care). Disallow
it so it falls back to the CPU path, which works.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
Kenneth Graunke [Fri, 20 May 2016 03:50:06 +0000 (20:50 -0700)]
i965: Also copy stencil miptree data.
The Meta path handles this, but the CPU/BLT fallbacks did not.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
Kenneth Graunke [Fri, 20 May 2016 03:46:22 +0000 (20:46 -0700)]
i965: Make a helper function for CopyImage of a miptree.
Currently, it only contains the BLT/CPU fallbacks, so the name is a bit
too generic. But eventually this will use BLORP as well, at which point
the name will make more sense.
The next patch will introduce a second call.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
Kenneth Graunke [Fri, 20 May 2016 03:29:04 +0000 (20:29 -0700)]
i965: Combine src/dest tex vs. rb checks in intel_copy_image_sub_data.
This simplifies things a little - now we only have one (tex or rb?)
if-ladder for src, and a second for dst, rather than four.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
Kenneth Graunke [Fri, 20 May 2016 02:20:12 +0000 (19:20 -0700)]
i965: Account for MinLayer in CopyImageSubData's blitter/CPU paths.
Fixes Piglit's arb_copy_image-texview test with the Meta path disabled
(so we hit the blitter/CPU fallback paths).
v2: Add MinLayer even for cube maps (suggested by Ilia).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Chris Forbes <chrisforbes@google.com>
Rob Clark [Sat, 14 May 2016 17:38:13 +0000 (13:38 -0400)]
freedreno/ir3: cmdline compiler for glsl
Use glsl/libstandalone.la to add support for taking glsl src files (in
addition to .tgsi) as input. Then glsl->nir and feed the result into
the ir3 backend as normal.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Rob Clark [Sat, 14 May 2016 15:59:26 +0000 (11:59 -0400)]
glsl: split out libstandalone
Split standalone glsl_compiler into a libstandalone.la and a thin
main.cpp. This way drivers can re-use the glsl standalone frontend in
their own standalone compilers.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Rob Clark [Wed, 25 May 2016 13:59:02 +0000 (09:59 -0400)]
android: drop build of standalone glsl_compiler
It's only a tool for debugging the glsl compiler, and should not be
installed.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Tested-by: Rob Herring <robh@kernel.org>
Acked-by: Emil Velikov <emil.l.velikov@gmail.com>
Matt Turner [Tue, 24 May 2016 19:23:00 +0000 (12:23 -0700)]
i965: Mark fallthrough in switch statement.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Matt Turner [Tue, 24 May 2016 20:06:36 +0000 (13:06 -0700)]
i965: Assert that a depth_mt exists when using HiZ.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Matt Turner [Tue, 24 May 2016 19:24:56 +0000 (12:24 -0700)]
nir: Strengthen assertion that 'out' is nonnull.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Matt Turner [Tue, 24 May 2016 19:29:30 +0000 (12:29 -0700)]
spirv: Mark default cases unreachable().
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Matt Turner [Tue, 24 May 2016 20:13:04 +0000 (13:13 -0700)]
isl: Mark default cases unreachable.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Matt Turner [Tue, 24 May 2016 19:29:51 +0000 (12:29 -0700)]
isl: Remove useless qualifier from return type.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Samuel Pitoiset [Wed, 25 May 2016 09:22:44 +0000 (11:22 +0200)]
nvc0: add descriptions for hardware perf counters/metrics
The GALLIUM_HUD does not yet expose a description for each event, but
this might be useful for developers who want to have a long description
of hw perf counters directly in the source code.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Brian Paul [Wed, 25 May 2016 17:49:31 +0000 (11:49 -0600)]
mesa: 80-column wrapping for _context_lost_GetSynciv()
Reviewed-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
Brian Paul [Wed, 25 May 2016 17:45:13 +0000 (11:45 -0600)]
mesa: add GLAPIENTRY to new _context_lost_X functions
To fix MSVC build. Any function which goes into the dispatch table
needs to have the GLAPIENTRY (__stdcall) tag.
Reviewed-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
Giuseppe Bilotta [Wed, 25 May 2016 13:32:08 +0000 (07:32 -0600)]
scons: support 2.5.0
The get_implicit_deps changed in SCons 2.5, expecting a callable rather
than a path as third argument. Detect the SCons versions and set the
argument appropriately to support both 2.5 and earlier versions.
This closes #95211.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95211
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Giuseppe Bilotta [Wed, 25 May 2016 13:30:06 +0000 (07:30 -0600)]
scons: whitespace cleanup
This text transformation was done automatically via the following shell
command:
$ find -name SCons\* -exec sed -i s/\\s\\+$// '{}' \;
Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Alejandro Piñeiro [Tue, 24 May 2016 13:00:30 +0000 (15:00 +0200)]
i965/fs: take into account doubles when emitting system values
Fixes the following cts test:
GL42-CTS.vertex_attrib_64bit.limits_test
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Kristian Høgsberg Kristensen [Wed, 25 May 2016 16:30:26 +0000 (09:30 -0700)]
i965: Fix shadowing of 'height' parameter
The nested declaration of 'height' shadows a parameter and uses
uninitialized memory. Fix by renaming to 'plane_height' which also makes
the code clearer.
This would typically break the bo size computation, but we don't use
that except when mmaping, and we don't mmap YUV buffers much.
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
Reported-by: Mathias Fröhlich <Mathias.Froehlich@gmx.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Kristian Høgsberg Kristensen [Wed, 25 May 2016 04:07:10 +0000 (21:07 -0700)]
mesa: Add .gitignore entries for make check binaries
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
Acked-by: Matt Turner <mattst88@gmail.com>
Kristian Høgsberg Kristensen [Tue, 24 May 2016 05:49:51 +0000 (22:49 -0700)]
i965: Enable GL_KHR_robustness
GL_KHR_robustness adds the GL_CONTEXT_LOST error and five new entry
points that we already implement. This patch adds a new dispatch table
that returns GL_CONTEXT_LOST from all entry points and implements the
GL_LOSE_CONTEXT_ON_RESET strategy by setting that table when we learn
that we've lost the context.
With the GL_CONTEXT_LOST reporting in place and dispatch for the new
entry points we can turn on GL_KHR_robustness.
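The dispatch-table swap described above can be sketched as follows. All type and function names are hypothetical, and the table has only two entry points for brevity; the one real value is 0x0507, which mirrors GL_CONTEXT_LOST. This is an illustration of the idea, not the code in the patch.

```c
#include <assert.h>

#define SKETCH_GL_NO_ERROR     0
#define SKETCH_GL_CONTEXT_LOST 0x0507 /* mirrors GL_CONTEXT_LOST */

struct sketch_context;
typedef void (*sketch_entry_fn)(struct sketch_context *ctx);

/* A tiny stand-in for a GL dispatch table. */
struct sketch_dispatch {
   sketch_entry_fn draw;
   sketch_entry_fn clear;
};

struct sketch_context {
   const struct sketch_dispatch *dispatch;
   unsigned error;
   unsigned draws;
};

static void sketch_real_draw(struct sketch_context *ctx) { ctx->draws++; }
static void sketch_real_clear(struct sketch_context *ctx) { (void)ctx; }

/* Every entry of the "lost" table only records GL_CONTEXT_LOST. */
static void
sketch_lost_entry(struct sketch_context *ctx)
{
   if (ctx->error == SKETCH_GL_NO_ERROR)
      ctx->error = SKETCH_GL_CONTEXT_LOST;
}

static const struct sketch_dispatch sketch_normal_table = {
   .draw = sketch_real_draw, .clear = sketch_real_clear,
};
static const struct sketch_dispatch sketch_lost_table = {
   .draw = sketch_lost_entry, .clear = sketch_lost_entry,
};

static void
sketch_context_init(struct sketch_context *ctx)
{
   ctx->dispatch = &sketch_normal_table;
   ctx->error = SKETCH_GL_NO_ERROR;
   ctx->draws = 0;
}

/* Called once the driver learns that the context was lost on reset. */
static void
sketch_notify_reset(struct sketch_context *ctx)
{
   assert(ctx);
   ctx->dispatch = &sketch_lost_table;
}
```

Swapping the table pointer keeps the check out of the normal entry points: calls made after the reset go straight to the error-reporting stub.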
Signed-off-by: Kristian Høgsberg Kristensen <krh@bitplanet.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Emil Velikov [Wed, 25 May 2016 16:38:06 +0000 (17:38 +0100)]
.mailmap: Use Chia-I Wu personal e-mail.
The LunarG one is bouncing.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Emil Velikov [Wed, 25 May 2016 16:28:22 +0000 (17:28 +0100)]
.mailmap: Use my (Emil Velikov) personal e-mail.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Ilia Mirkin [Wed, 25 May 2016 00:03:22 +0000 (20:03 -0400)]
docs: add missing GL_OES/EXT_gpu_shader5 enablement note
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Tue, 24 May 2016 23:57:47 +0000 (19:57 -0400)]
glsl: add GL_EXT_clip_cull_distance define, add helpers
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Brian Paul [Tue, 24 May 2016 23:44:30 +0000 (17:44 -0600)]
tgsi: print TGSI_PROPERTY_NEXT_SHADER value as string, not an integer
Print "GEOM" instead of "2", for example.
v2: also update the text parsing code, per Ilia.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Brian Paul [Tue, 24 May 2016 23:44:08 +0000 (17:44 -0600)]
tgsi: s/6/PIPE_SHADER_TYPES/ for tgsi_processor_type_names array size
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Jason Ekstrand [Tue, 24 May 2016 21:17:16 +0000 (14:17 -0700)]
nir/spirv: Handle location decorations on structure members
Jason Ekstrand [Tue, 24 May 2016 20:59:10 +0000 (13:59 -0700)]
nir/spirv: Add explicit handling for all decorations
From time to time we have had cases where glslang has added a decoration we
don't handle and it has caused problems. This audit ensures that, for
every decoration, we either handle it or hit an unreachable() with an
accurate description of why we don't have to.
Jason Ekstrand [Tue, 24 May 2016 23:57:38 +0000 (16:57 -0700)]
i965/draw: Use the correct buffer index for interleaved VBO sizes
The buffer_range_* arrays are indexed by buffer index not element index.
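A minimal sketch of the indexing rule, assuming a simplified layout in which each vertex element records which buffer it reads from. The struct, function and parameter names are hypothetical and do not match the driver code.

```c
/* Hypothetical: each vertex element records which buffer it reads from. */
struct sketch_element {
    unsigned buffer_index;
};

/*
 * With interleaved data several elements share one buffer, so the
 * per-buffer range arrays must be indexed through the element's
 * buffer index, never by the element index itself.
 */
static unsigned sketch_range_size(const struct sketch_element *elements,
                                  unsigned element,
                                  const unsigned *range_start,
                                  const unsigned *range_end)
{
    unsigned buffer = elements[element].buffer_index;
    return range_end[buffer] - range_start[buffer];
}
```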
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Jordan Justen [Tue, 24 May 2016 00:34:51 +0000 (17:34 -0700)]
i965/gen7: Fix gl_HelperInvocation
It appears that UV immediates aren't working on Ivy Bridge. In this
case, a signed version will work, and this fixes the piglit
tests/spec/glsl-4.50/execution/helper-invocation.shader_test test.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Emil Velikov [Thu, 21 Apr 2016 16:29:16 +0000 (17:29 +0100)]
mesa_glinterop: make GL interop version field bidirectional
This allows clear and easy communication between the two.
Caller: Requesting information (struct vN)
Callee: I know how to deal with an older version (vN-1) only. Here is your
data and the version I support.
Caller: Older version? Sure, I'll cap all access to the fields provided
by the older version (vN-1).
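The exchange can be sketched roughly as below. The struct layout, field names and the sketch_query function are assumptions made for illustration; they are not the mesa_glinterop definitions.

```c
/* Hypothetical struct; the real interop structs have different fields. */
struct sketch_info {
    unsigned version;  /* in: version the caller asks for; out: version filled in */
    unsigned v1_field; /* present since version 1 */
    unsigned v2_field; /* added in version 2 */
};

/*
 * Callee side. callee_version is the newest layout this callee knows.
 * It fills only the fields of the lower of the two versions and writes
 * that version back so the caller can see what it actually received.
 */
static void sketch_query(struct sketch_info *info, unsigned callee_version)
{
    unsigned v = info->version < callee_version ? info->version : callee_version;

    info->v1_field = 11;
    if (v >= 2)
        info->v2_field = 22;

    info->version = v;
}
```

After the call, the caller reads info->version and touches only the fields that version defines.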
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Emil Velikov [Thu, 21 Apr 2016 16:16:49 +0000 (17:16 +0100)]
mesa_glinterop: drop mesa_glinterop_device_info::interop_version
One cannot use a single version to control both export_in and export_out
versions. Using this forces us to always extend/bump both structs at the
same time.
An alternative scheme is coming with the next patch.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Emil Velikov [Tue, 3 May 2016 10:13:12 +0000 (11:13 +0100)]
st/dri: add note about GL interop version checks
... and make them more explicit.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Emil Velikov [Tue, 3 May 2016 10:10:54 +0000 (11:10 +0100)]
mesa_glinterop: rename MESA_GLINTEROP_INVALID_{VALUE,VERSION}
Be more explicit what it actually does.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Emil Velikov [Thu, 21 Apr 2016 15:18:39 +0000 (16:18 +0100)]
mesa_glinterop: s/struct_version/version/
OCD polish for consistency with other mesa interfaces.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Emil Velikov [Thu, 21 Apr 2016 15:16:07 +0000 (16:16 +0100)]
mesa_glinterop: fix GL interop *_VERSION comments
Using the macro to set the version is wrong and ill-advised. Please don't
do it.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Emil Velikov [Tue, 3 May 2016 11:25:53 +0000 (12:25 +0100)]
mesa_glinterop: remove inclusion of EGL header
Analogous to previous commit, but for EGL.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Emil Velikov [Tue, 3 May 2016 11:25:34 +0000 (12:25 +0100)]
mesa_glinterop: remove inclusion of GLX header
Since we only need partial information about the GLX symbols we can
forward declare them and drop the include. Obviously each user of the
said API will need more than what's provided, so they'll include the
GLX header.
If they don't, the compiler will give us a nice warning ;-)
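A sketch of the forward-declaration approach, assuming the usual struct tags behind the Xlib and GLX handle types. sketch_have_handles is a hypothetical helper that exists only to show that such handles can be passed around without the full definitions.

```c
#include <stddef.h>

/*
 * Forward declarations standing in for #include <GL/glx.h>.
 * Assumption: the tags match the ones the real headers use, so a file
 * that also includes the GLX header sees identical typedefs
 * (C11 allows an identical typedef to be repeated).
 */
typedef struct _XDisplay Display;
typedef struct __GLXcontextRec *GLXContext;

/*
 * Hypothetical helper: handles to incomplete types can be stored,
 * compared and passed on, which is all an interface header needs.
 */
static int sketch_have_handles(Display *dpy, GLXContext ctx)
{
    return dpy != NULL && ctx != NULL;
}
```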
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Emil Velikov [Tue, 3 May 2016 11:14:26 +0000 (12:14 +0100)]
mesa_glinterop: remove unneeded GLAPI/GLAPIENTRY/APIENTRYP symbols
These come from windows.h, gl.h, glcorearb.h and/or glext.h.
The interop interface is aimed at non-Windows platforms while the macros
are used/derived due to Windows specifics. Thus we can safely remove
them.
Strictly speaking there should be GLXAPIENTRY/EGLAPIENTRY and similar
macros, although a) there are no GLX ones and b) this brings us even
further from decoupling the file from the GLX/EGL header dependency.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Emil Velikov [Tue, 3 May 2016 11:13:43 +0000 (12:13 +0100)]
mesa_glinterop: replace GL types with their native counterpart.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Emil Velikov [Thu, 21 Apr 2016 15:36:01 +0000 (16:36 +0100)]
mesa_glinterop: use generic variable types for the GL interop
Thus we can preserve the ABI, while avoiding the inclusion of some/all
of the following:
EGL/egl.h
GL/gl.h
GL/glcorearb.h
GLES/gl.h
GLES2/gl2.h
GLES3/gl3.h
GLES3/gl31.h
This will allow us to build/use it alongside any combination of APIs.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Emil Velikov [Thu, 21 Apr 2016 15:20:45 +0000 (16:20 +0100)]
mesa_glinterop: use consistent naming scheme for GL interop
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Emil Velikov [Tue, 24 May 2016 13:21:31 +0000 (14:21 +0100)]
Revert "mesa: Build EGL without X11 headers after interop patchset"
This reverts commit 4e2c9a04354b6b133845b8b93c0c5d34261a91d0.
The solution was incomplete and fragile. An alternative one is coming
shortly.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Tom Stellard <thomas.stellard@amd.com>
Ian Romanick [Tue, 24 May 2016 19:43:18 +0000 (12:43 -0700)]
docs: Note that GL_OES_geometry_shader and GL_OES_tessellation_shader are started
The GL_OES_geometry_shader work is on the oes_shader_io_blocks branch
of idr's fd.o repository.
The GL_OES_tessellation_shader work is on the tess-gles branch
of kwg's fd.o repository.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Emil Velikov [Tue, 24 May 2016 15:23:09 +0000 (16:23 +0100)]
c11/threads: resolve link issues with -O0
Add weak symbol notation for the pthread_mutexattr* symbols, thus making
the linker happy. When building with -O1 or greater the optimiser will
kick in and remove the said functions as they are dead/unreachable code.
Ideally we'll enable the optimisations locally, yet that does not seem
to work atm.
v2: Add the AX_GCC_FUNC_ATTRIBUTE([weak]) hunk in configure.
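The general weak-symbol mechanism can be sketched as follows, assuming GCC or Clang on an ELF platform. sketch_optional_hook is a hypothetical symbol that nothing defines; the actual patch applies the attribute to the pthread_mutexattr* declarations instead.

```c
/*
 * Declared weak: if no object or library defines the symbol, the
 * reference resolves to a null address instead of causing an
 * "undefined reference" link error, even at -O0 where the call
 * is not optimised away.
 */
extern int sketch_optional_hook(int value) __attribute__((weak));

static int sketch_call_hook(int value)
{
    /* A weak function that was never defined compares equal to null. */
    if (sketch_optional_hook)
        return sketch_optional_hook(value);
    return -1; /* fallback when the symbol is absent */
}
```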
Cc: Alejandro Piñeiro <apinheiro@igalia.com>
Cc: Ben Widawsky <ben@bwidawsk.net>
Cc: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: Rob Herring <robh@kernel.org>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Tested-by: Mark Janes <mark.a.janes@intel.com>
Tim Rowley [Fri, 20 May 2016 00:08:53 +0000 (18:08 -0600)]
swr: [rasterizer] remove containers.hpp
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Thu, 19 May 2016 22:23:07 +0000 (16:23 -0600)]
swr: [rasterizer core] remove utility dead code
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Tue, 17 May 2016 23:26:27 +0000 (17:26 -0600)]
swr: [rasterizer core] buckets fixes
1. Don't clear bucket descriptions to fix issues with sim level
buckets getting out of sync.
2. Close out threadviz file descriptors in ClearThreads().
3. Skip jitter-based buckets when multithreaded. We need
thread local storage through llvm jit functions to be fixed before
we can enable this.
4. Fix buckets StopCapture to correctly detect capture complete.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Tue, 17 May 2016 22:32:08 +0000 (16:32 -0600)]
swr: [rasterizer core] move centroid setup out of CalcCentroidBarycentrics
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Tim Rowley [Fri, 20 May 2016 16:15:43 +0000 (11:15 -0500)]
swr: [rasterizer jitter] implement InstanceID/VertexID in fetch jit
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>