Jason Ekstrand [Fri, 31 Mar 2017 22:21:04 +0000 (15:21 -0700)]
anv/blorp: Align vertex buffers to 64B
This fixes issues seen when adding support for full 48-bit addresses.
The 48-bit addresses themselves have nothing to do with it other than
that it caused the kernel to place buffers slightly differently so they
interacted differently with the caches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Jason Ekstrand [Mon, 27 Mar 2017 23:03:57 +0000 (16:03 -0700)]
anv: Query the kernel for reset status
When a client causes a GPU hang (or experiences issues due to a hang in
another client) we want to let it know as soon as possible. In
particular, if it submits work with a fence and calls vkWaitForFences or
vkQueueWaitIdle and it returns VK_SUCCESS, then the client should be
able to trust the results of that rendering. In order to provide this
guarantee, we have to ask the kernel for context status in a few key
locations.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Mon, 27 Mar 2017 23:01:42 +0000 (16:01 -0700)]
anv: Check for device loss at the end of WaitForFences
It's possible that the device could have been lost while we were
waiting. We should let the user know if this has happened.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jason Ekstrand [Mon, 3 Apr 2017 19:25:15 +0000 (12:25 -0700)]
anv/pipeline: Properly handle unset gl_Layer and gl_ViewportIndex
When the shader does not set one of these values, they are supposed to
get a default value of 0. We have hardware bits in 3DSTATE_CLIP for
this but haven't been setting them. This fixes the intermittent failure
of dEQP-VK.geometry.layered.3d.render_to_default_layer.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Jason Ekstrand [Wed, 29 Mar 2017 22:16:15 +0000 (15:16 -0700)]
i965/fs: Always provide a default LOD of 0 for TXS and TXL
We already provide a default LOD for textureQueryLevels and texture() on
non-fragment stages. However, there are more cases where one is needed
such as textureSize(gsampler2DMS*) in SPIR-V. Instead of trying to list
out all of the cases one at a time, just provide the default for all TXS
and TXL operations. This fixes a shader validation error in the new
Sascha deferredmultisampling demo which uses textureSize(gsampler2DMS).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100391
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Kenneth Graunke [Thu, 23 Feb 2017 23:04:52 +0000 (15:04 -0800)]
mesa: Require mipmap completeness for glCopyImageSubData(), sometimes.
This patch makes glCopyImageSubData require mipmap completeness when the
texture object's built-in sampler object has a mipmapping MinFilter.
Fixes (on i965):
dEQP-GLES31.functional.debug.negative_coverage.*.buffer.copy_image_sub_data
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Vinson Lee [Tue, 4 Apr 2017 21:52:39 +0000 (14:52 -0700)]
libgl-xlib: Link with libunwind.
Fix linking error.
CXXLD libGL.la
../../../../src/gallium/auxiliary/.libs/libgallium.a(u_debug_stack.o): In function `debug_backtrace_capture':
src/gallium/auxiliary/util/u_debug_stack.c:59: undefined reference to `_Ux86_64_getcontext'
src/gallium/auxiliary/util/u_debug_stack.c:60: undefined reference to `_ULx86_64_init_local'
src/gallium/auxiliary/util/u_debug_stack.c:62: undefined reference to `_ULx86_64_step'
src/gallium/auxiliary/util/u_debug_stack.c:71: undefined reference to `_ULx86_64_get_proc_info'
src/gallium/auxiliary/util/u_debug_stack.c:73: undefined reference to `_ULx86_64_get_proc_name'
src/gallium/auxiliary/util/u_debug_stack.c:65: undefined reference to `_ULx86_64_step'
Fixes: 70c272004f72 ("gallium/util: libunwind support")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100562
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Jason Ekstrand [Tue, 4 Apr 2017 18:31:22 +0000 (11:31 -0700)]
intel/isl: Refactor and clarify gen8 alignment calculations
Adding the actual table from the docs makes it clearer exactly what the
restrictions are. In particular, it becomes clear that compressed
textures ignore the alignment parameters in RENDER_SURFACE_STATE.
Reviewed-by: Chad Versace <chadversary@chromium.org>
Francisco Jerez [Tue, 4 Apr 2017 21:12:59 +0000 (14:12 -0700)]
drirc: Set glsl_zero_init for Kerbal Space Program.
This fixes the stripes of garbage rendered on the floor of the vehicle
assembly building among other rendering issues. The reason for the
misrendering seems to be that some of the GLSL shaders used by the
application use variables before initializing them, incorrectly
assuming that they will be implicitly set to zero by the
implementation.
Acked-by: Matt Turner <mattst88@gmail.com>
Lionel Landwerlin [Wed, 1 Mar 2017 14:39:58 +0000 (14:39 +0000)]
intel: tools: add aubinator_error_decode tool
This is pretty much the same tool as what i-g-t has, only with fancier
decoding of the instructions/registers. It also doesn't support
anything before gen4.
v2 (from Matt): Drop authors
Remove undefined automake variable
v3: Fix incorrect offsets for dword > 1 (Jordan)
v4: Fix decompression error with large blobs (Jordan)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Lionel Landwerlin [Fri, 10 Mar 2017 17:27:01 +0000 (17:27 +0000)]
intel: genxml: add RING_BUFFER_CTL registers
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Lionel Landwerlin [Fri, 10 Mar 2017 14:27:53 +0000 (14:27 +0000)]
intel: genxml: add FAULT_REG register
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Lionel Landwerlin [Fri, 10 Mar 2017 14:27:23 +0000 (14:27 +0000)]
intel: genxml: add gen7 ERR_INT register
v2: add register to gen7.5 (Matt)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Lionel Landwerlin [Thu, 9 Mar 2017 15:38:43 +0000 (15:38 +0000)]
intel: genxml: add ACTHD registers
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Lionel Landwerlin [Thu, 9 Mar 2017 15:38:20 +0000 (15:38 +0000)]
intel: genxml: add GFX_ARB_ERROR_RPT register
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Lionel Landwerlin [Thu, 9 Mar 2017 11:58:19 +0000 (11:58 +0000)]
intel: genxml: add INSTDONE registers
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Marek Olšák [Thu, 23 Mar 2017 23:55:55 +0000 (00:55 +0100)]
targets: export radeon winsys_create functions to silence LLVM warning
It silences the following radeonsi LLVM warning due to a previous
commit adding an LLVM workaround:
"mesa: for the -simplifycfg-sink-common option: may only occur zero or one
times!"
Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Constantine Kharlamov [Sun, 2 Apr 2017 17:33:06 +0000 (20:33 +0300)]
r600g: check rasterizer primitive states like in radeonsi
Specifically, non-line primitives are skipped, and the default is to reset on
each packet.
Skipping non-line primitives saves ≈110 resets of the
PA_SC_LINE_STIPPLE register per frame in Kane&Lynch2.
Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
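The idea described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the r600g code: the enum, the struct, and the counter that stands in for a PA_SC_LINE_STIPPLE register write are all hypothetical.

```c
#include <stdbool.h>

/* Hypothetical primitive types, standing in for pipe_prim_type. */
enum prim_type { PRIM_POINTS, PRIM_LINES, PRIM_LINE_STRIP, PRIM_TRIANGLES };

struct rast_state {
   enum prim_type last_rast_prim;  /* primitive seen on the previous draw */
   unsigned stipple_writes;        /* counts simulated register writes */
};

static bool
is_line_prim(enum prim_type prim)
{
   return prim == PRIM_LINES || prim == PRIM_LINE_STRIP;
}

/* Write the (simulated) line-stipple register only when the primitive
 * changed and the new primitive is a line type. */
static void
emit_rasterizer_prim_state(struct rast_state *state, enum prim_type prim)
{
   if (prim == state->last_rast_prim)
      return;

   if (is_line_prim(prim))
      state->stipple_writes++;

   state->last_rast_prim = prim;
}
```

Repeated draws with the same primitive, and any draw with a non-line primitive, leave the counter untouched in this sketch, which is where the saved register writes would come from.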
Constantine Kharlamov [Sun, 2 Apr 2017 17:33:05 +0000 (20:33 +0300)]
r600g: extract code into r600_emit_rasterizer_prim_state()
Also change gs_output_prim type: unsigned → pipe_prim_type. The idea of
the code is mostly taken from radeonsi. The new code operating on
prev/curr rast_primitives saves ≈15 reloads of PA_SC_LINE_STIPPLE per
frame in Kane&Lynch2
Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Constantine Kharlamov [Sun, 2 Apr 2017 17:33:04 +0000 (20:33 +0300)]
r600g/radeonsi: use the correct types (taken from pipe_draw_info)
Note: si_shader.h has also "type" variable that should be changed to
"enum pipe_prim_type", however it triggers a bunch of warnings about
unhandled switches, so, not knowing the correct way to handle them, I
decided to leave it as is.
Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Constantine Kharlamov [Sun, 2 Apr 2017 17:33:03 +0000 (20:33 +0300)]
r600g: remove duplicate memset by using a pointer, and constify args
Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Elie TOURNIER [Thu, 9 Mar 2017 15:16:54 +0000 (15:16 +0000)]
glsl: remove unused file
udivmod64 appears in src/compiler/glsl/builtin_int64.h and src/compiler/glsl/udivmod.h
The second file seems unused.
Fixes commit 6b03b345eb64e15e577bc8b2cf04b314a4c70537
This change doesn't affect shader-db.
Signed-off-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Marek Olšák [Mon, 3 Apr 2017 09:49:59 +0000 (11:49 +0200)]
radeonsi: access gallivm through ctx in most places
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 3 Apr 2017 09:37:10 +0000 (11:37 +0200)]
radeonsi: use ctx->types instead of bld->types etc.
even vec_type is f32.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 3 Apr 2017 09:23:59 +0000 (11:23 +0200)]
radeonsi: use i32_0/1 instead of *int_bld.zero/one in most places
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 2 Apr 2017 12:42:17 +0000 (14:42 +0200)]
gallium: decrease the size of pipe_draw_info - 88 -> 80 bytes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sun, 2 Apr 2017 00:36:16 +0000 (02:36 +0200)]
gallium: decrease the size of pipe_vertex_element - 16 -> 8 bytes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sun, 2 Apr 2017 00:13:12 +0000 (02:13 +0200)]
gallium: decrease the size of pipe_resource - 64 -> 48 bytes
Some other changes needed here.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sun, 2 Apr 2017 00:00:49 +0000 (02:00 +0200)]
gallium: decrease the size of pipe_box - 24 -> 16 bytes
Also:
pipe_transfer: 48 -> 40 bytes.
pipe_blit_info: 176 -> 160 bytes.
v2: add a comment at pipe_box
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
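As an illustration of how a 24 -> 16 byte reduction like the one above can come about, here is a sketch that narrows some 32-bit fields to 16 bits. Both struct layouts are hypothetical stand-ins and are not claimed to match the actual pipe_box definition before or after the commit.

```c
#include <stdint.h>

/* Hypothetical "before" layout: six 32-bit fields, 24 bytes. */
struct box_wide {
   int32_t x, y, z;
   int32_t width, height, depth;
};

/* Hypothetical "after" layout: four fields narrowed to 16 bits.
 * The 16-bit pairs pack into one 32-bit slot each, so no padding
 * is needed and the struct shrinks to 16 bytes. */
struct box_narrow {
   int32_t x;
   int16_t y;
   int16_t z;
   int32_t width;
   int16_t height;
   int16_t depth;
};

_Static_assert(sizeof(struct box_wide) == 24, "six 32-bit fields");
_Static_assert(sizeof(struct box_narrow) == 16, "two 32-bit + four 16-bit");
```

The trade-off in a layout like this is a smaller value range for the narrowed fields, which is presumably why a comment at the struct is worthwhile.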
Marek Olšák [Sat, 1 Apr 2017 23:51:57 +0000 (01:51 +0200)]
gallium: decrease the size of pipe_sampler_view - 48 -> 32 bytes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sat, 1 Apr 2017 23:46:11 +0000 (01:46 +0200)]
gallium: decrease the size of pipe_surface - 48 -> 40 bytes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sat, 1 Apr 2017 23:27:13 +0000 (01:27 +0200)]
gallium: decrease the size of pipe_framebuffer_state - 96 -> 80 bytes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sat, 1 Apr 2017 23:24:47 +0000 (01:24 +0200)]
gallium: decrease the size of pipe_stream_output_info - 532 -> 268 bytes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Sat, 1 Apr 2017 23:10:36 +0000 (01:10 +0200)]
gallium: decrease the size of pipe_rasterizer_state - 36 -> 32 bytes
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Marek Olšák [Mon, 27 Feb 2017 21:25:44 +0000 (22:25 +0100)]
amd/addrlib: second update for Vega10 + bug fixes
Highlights:
- Display needs tiled pitch alignment to be at least 32 pixels
- Implement Addr2ComputeDccAddrFromCoord().
- Macro-pixel packed formats don't support Z swizzle modes
- Pad pitch and base alignment of PRT + TEX1D to 64KB.
- Fix support for multimedia formats
- Fix a case where "PRT" entries are not selected on SI.
- Fix wrong upper bits in equations for 3D resource.
- We can't support 2d array slice rotation in gfx8 swizzle pattern
- Set base alignment for PRT + non-xor swizzle mode resource to 64KB.
- Bug workaround for Z16 4x/8x and Z32 2x/4x/8x MSAA depth texture
- Add stereo support
- Optimize swizzle mode selection
- Report pitch and height in pixels for each mip
- Adjust bpp/expandX for format ADDR_FMT_GB_GR/ADDR_FMT_BG_RG
- Correct tcCompatible flag output for mipmap surface
- Other fixes and cleanups
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 2 Apr 2017 23:44:32 +0000 (01:44 +0200)]
radeonsi: use i32_0 and i32_1 more
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 2 Apr 2017 23:41:24 +0000 (01:41 +0200)]
radeonsi: remove most uses of lp_build_const*
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 2 Apr 2017 23:25:02 +0000 (01:25 +0200)]
radeonsi: clean up 'radeon_bld' references
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 2 Apr 2017 22:22:16 +0000 (00:22 +0200)]
radeonsi: fix broken texture filtering on SI-CIK since GFX9 changes
Don't clear state[7] on SI-CIK, and only do the meta stuff on VI+.
Fixes: 5abf60076ce4 ("radeonsi/gfx9: image descriptor changes in mutable fields")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100531
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Juan A. Suarez Romero [Mon, 3 Apr 2017 16:48:33 +0000 (18:48 +0200)]
bin/get-fixes-pick-list.sh: fix typo
Replace "nore" by "more".
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Mauro Rossi [Sat, 1 Apr 2017 10:50:33 +0000 (12:50 +0200)]
android: intel: genxml: fix genX_xml.h generation rules
Recent changes in Makefile.sources merged the aubinator files in
a unique list of generated files and genxml/genX_xml.h is now needed
to avoid the following building error:
ninja: error: '.../genxml/genX_xml.h', needed by '.../genxml/genX_xml.h',
missing and no known rule to make it
build/core/ninja.mk:148: recipe for target 'ninja_wrapper' failed
Fixes: 0f83c05 "intel: genxml: compress all gen files into one"
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 3 Apr 2017 23:24:47 +0000 (16:24 -0700)]
intel/vec4: Add some fall through comments
Reviewed-by: Matt Turner <mattst88@gmail.com>
Bartosz Tomczyk [Mon, 3 Apr 2017 19:19:40 +0000 (21:19 +0200)]
mesa/glthread: Avoid unnecessary batch reallocation
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bas Nieuwenhuizen [Mon, 3 Apr 2017 17:40:06 +0000 (19:40 +0200)]
radv: Increase descriptor limits.
We generally support more. Decreased the dynamic buffers though, as
we only support 16 for uniform+storage.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Bartosz Tomczyk [Mon, 3 Apr 2017 19:12:54 +0000 (21:12 +0200)]
mesa/glthread: fix misaligned address access
Address sanitizer reports a lot of misaligned accesses:
SUMMARY: AddressSanitizer: undefined-behavior main/marshal.c:276:31 in
main/marshal.c:276:31: runtime error: load of misaligned address 0x631000104866 for type
'const GLuint' (aka 'const unsigned int'), which requires 4 byte alignment
0x631000104866: note: pointer points here
92 88 00 00 00 00 00 00 4a 03 0c 00 93 88 00 00 00 00 00 00 02 01 0c 00 40 8d 00 00 00 00 00 00
^
SUMMARY: AddressSanitizer: undefined-behavior main/marshal_generated.c:28725:12 in
main/marshal_generated.c:28726:12: runtime error: member access within misaligned address 0x6310003fc874 for type
'struct marshal_cmd_VertexAttribPointer', which requires 8 byte alignment
0x6310003fc874: note: pointer points here
01 00 00 00 7a 02 20 00 00 00 00 00 be be be be be be be be be be be be be be be be be be be be
^
SUMMARY: AddressSanitizer: undefined-behavior main/marshal_generated.c:28726:12 in
main/marshal_generated.c:28726:12: runtime error: store to misaligned address 0x6310003fc87c for type
'GLint' (aka 'int'), which requires 8 byte alignment
0x6310003fc87c: note: pointer points here
00 00 00 00 be be be be be be be be be be be be be be be be be be be be be be be be be be be be
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
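The sanitizer output above comes from dereferencing typed pointers into a byte stream at addresses that are not suitably aligned. One general way to read such data safely is sketched below. It is an illustration of the problem class only: the helper name is hypothetical, and the commit itself may well have fixed the issue differently (for example by padding the marshalled commands).

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit value from a possibly misaligned position in a byte
 * buffer. memcpy has no alignment requirement, so this avoids the
 * undefined behaviour of dereferencing a misaligned uint32_t pointer. */
static inline uint32_t
load_u32_unaligned(const void *src)
{
   uint32_t value;
   memcpy(&value, src, sizeof(value));
   return value;
}
```

Compilers typically turn a fixed-size memcpy like this into a plain load on targets that tolerate unaligned access.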
Bartosz Tomczyk [Mon, 3 Apr 2017 17:39:19 +0000 (19:39 +0200)]
glsl: Fix blob memory leak
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Bas Nieuwenhuizen [Sun, 2 Apr 2017 10:32:39 +0000 (12:32 +0200)]
radv: Rework guard band calculation.
We want the guardband_x/y to be the largest scalars such that each
viewport scaled by that amount is still a subrange of [-32767, 32767].
The old code has a couple of issues:
1) It used scissor instead of viewport_scissor, potentially taking into
account a viewport that is too small and therefore selecting a scale
that is too large.
2) Merging the viewports isn't ideal, as for example viewports with
boundaries [0,1] and [1000, 1001] would allow a guardband scale of ~30k,
while their union [0, 1001] only allows a scale of ~32.
The new code just determines the guardband per viewport and takes the minimum.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
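The per-viewport minimum described above can be sketched for one axis as follows. This assumes each viewport is given as a centre (translate) and a half-extent (scale), and that scaling happens about the centre; the struct, the helper names, and that parameterization are assumptions for illustration, so the resulting numbers need not match the rough figures quoted in the message.

```c
#include <float.h>
#include <stddef.h>

/* Hypothetical viewport description along one axis. */
struct viewport_axis {
   float translate;  /* centre of the viewport, in pixels */
   float scale;      /* half the viewport extent; assumed non-zero */
};

#define GUARDBAND_MAX_RANGE 32767.0f

static inline float
absf(float v)
{
   return v < 0.0f ? -v : v;
}

/* Largest factor by which every viewport can be scaled about its centre
 * while staying inside [-32767, 32767]: compute the limit per viewport
 * and keep the minimum. */
static float
min_guardband(const struct viewport_axis *vp, size_t count)
{
   float guardband = FLT_MAX;

   for (size_t i = 0; i < count; i++) {
      float g = (GUARDBAND_MAX_RANGE - absf(vp[i].translate)) /
                absf(vp[i].scale);
      if (g < guardband)
         guardband = g;
   }

   return guardband;
}
```

Under this model two narrow viewports far apart still yield a very large factor, while a single viewport covering their union yields a much smaller one, which is the effect the second point in the message describes.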
Bas Nieuwenhuizen [Mon, 3 Apr 2017 19:33:51 +0000 (21:33 +0200)]
radv: Enable VK_KHR_incremental_present.
Just enabling the driver-independent implementation that Jason did.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Jason Ekstrand [Tue, 24 Jan 2017 23:13:31 +0000 (15:13 -0800)]
anv: Implement VK_KHR_incremental_present
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Jason Ekstrand [Tue, 24 Jan 2017 23:29:43 +0000 (15:29 -0800)]
vulkan/wsi/wayland: Pass damage through to the compositor
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Jason Ekstrand [Tue, 24 Jan 2017 23:11:01 +0000 (15:11 -0800)]
vulkan/wsi: Plumb present regions through the common code
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Jason Ekstrand [Sat, 1 Apr 2017 05:15:31 +0000 (22:15 -0700)]
vulkan/wsi: Fix some line wrapping
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Dave Airlie [Mon, 3 Apr 2017 03:43:15 +0000 (04:43 +0100)]
radv: fix interp at sample code.
Interp at sample needs to use the center, since the sample
positions it retrieves are relative to the center.
This fixes a bunch of CTS tests with multisample_interpolation.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 3 Apr 2017 03:38:12 +0000 (04:38 +0100)]
radv: overhaul fragment shader sample positions.
The current code was broken, and I decided to redesign it instead.
This puts the sample positions for all samples into the queue
constant descriptor buffer after all the spill/ring descriptors.
It then uses a single offset register to point how far into the
samples the samples for num_samples are. This saves one user sgpr
and means we only generate the sample position data in the rare
single case where we need it currently.
This doesn't fix the failing CTS tests without the followup
fix.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Lionel Landwerlin [Sun, 2 Apr 2017 00:07:37 +0000 (01:07 +0100)]
aubinator/gen_decoder/i965: decode instructions from dword 0
Some packets like 3DSTATE_VF_STATISTICS, 3DSTATE_DRAWING_RECTANGLE,
3DPRIMITIVE, PIPELINE_SELECT, etc... have configurable fields in
dword0, we probably want to print those.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Lionel Landwerlin [Sun, 2 Apr 2017 14:54:14 +0000 (15:54 +0100)]
intel: gen_decoder: store pointer to current decoded field in iterator
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Dave Airlie [Sun, 19 Mar 2017 03:39:29 +0000 (13:39 +1000)]
radv/ac: fix texture derivative ordering
The ordering NIR gives us is correct for the hw, this fixes:
dEQP-VK.glsl.texture_functions.texturegrad.* (mainly triggered
on isampler/usampler 3d textures.).
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Wed, 15 Mar 2017 05:45:05 +0000 (15:45 +1000)]
radv/ac: round cube array coordinate before fixup.
This fixes:
dEQP-VK.glsl.texture_functions.texture.samplercubearray*
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Mon, 3 Apr 2017 18:57:14 +0000 (04:57 +1000)]
radv: move to using common buffer load format.
Get rid of usage of SI.vs.load.input.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Brian Paul [Mon, 3 Apr 2017 14:45:07 +0000 (08:45 -0600)]
util: fix MSVC warning in u_align_u32()
To silence
C:\Users\Brian\projects\mesa\src\util/u_vector.h(41) : warning C4146: unary
minus operator applied to unsigned type, result still unsigned
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
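Warning C4146 is commonly triggered by building a power-of-two alignment mask with unary minus on an unsigned value. A sketch of a warning-free equivalent follows; the function name is hypothetical, and whether the commit used exactly this expression is an assumption.

```c
#include <stdint.h>

/* Round v up to a multiple of a, where a must be a power of two.
 * For such a, -a and ~(a - 1) produce the same mask on an unsigned
 * type, but only the second form avoids negating an unsigned value. */
static inline uint32_t
align_u32_no_negate(uint32_t v, uint32_t a)
{
   return (v + a - 1) & ~(a - 1);
}
```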
Brian Paul [Mon, 3 Apr 2017 14:44:10 +0000 (08:44 -0600)]
util: #include "c99_compat.h" to fix Windows build
Otherwise, we were getting the definition for 'inline' by chance from
some other preceding #include.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Brian Paul [Mon, 3 Apr 2017 14:36:45 +0000 (08:36 -0600)]
util: s/SHA1_H/MESA_SHA1_H/
To follow the convention of other header include guards.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Brian Paul [Mon, 3 Apr 2017 14:42:36 +0000 (08:42 -0600)]
svga: add comment on svga_buffer_hw_storage_map()
Trivial.
Rhys Kidd [Sun, 2 Apr 2017 20:48:39 +0000 (16:48 -0400)]
travis: Support LLVM 3.8+ on Trusty-based Travis-CI via apt-get not apt addon
Per comments by Travis-CI, the apt addon is only really needed for the
container-based Precise builds, as they don't yet support Trusty on that platform.
Mesa currently uses Trusty fully-virtualized environment (due to sudo: required).
See further:
https://docs.travis-ci.com/user/trusty-ci-environment/#Fully-virtualized-via-sudo%3A-required
https://github.com/travis-ci/apt-source-whitelist/pull/205#issuecomment-216054237
Signed-off-by: Rhys Kidd <rhyskidd@gmail.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Grazvydas Ignotas [Thu, 30 Mar 2017 22:26:25 +0000 (01:26 +0300)]
util/u_atomic: provide 64bit atomics where they're missing
There are still some distributions trying to support unfortunate people
with old or exotic CPUs that don't have 64bit atomic operations. When
compiling for such a machine, gcc conveniently inserts a library call to
a helper, but its implementation is missing and we get a linker error.
This allows us to provide our own implementation, which is marked weak
to prefer a better implementation, should one exist.
v2: changed copyright, some style adjustments
v3: [mattst88] Print results with AC_MSG_CHECKING/AC_MSG_RESULT
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93089
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
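A weak fallback of the kind described above might look roughly like this. It is a sketch under stated assumptions: the function name is hypothetical (it is not the helper symbol gcc calls), the lock is a simple spinlock built from a 32-bit atomic exchange rather than whatever Mesa uses, and the weak attribute is GCC/Clang-specific.

```c
#include <stdint.h>

/* Hypothetical lock guarding all fallback 64-bit operations. */
static volatile int fallback_lock;

/* Marked weak so that a better definition found at link time takes
 * precedence over this one. */
__attribute__((weak)) uint64_t
fallback_add_and_fetch_u64(uint64_t *ptr, uint64_t val)
{
   uint64_t result;

   while (__sync_lock_test_and_set(&fallback_lock, 1))
      ; /* spin until the lock is free */

   *ptr += val;
   result = *ptr;

   __sync_lock_release(&fallback_lock);

   return result;
}
```

A single global lock serializes every 64-bit atomic, which is slow but acceptable for a path that only exists on CPUs lacking the native instruction.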
Rob Clark [Fri, 24 Mar 2017 20:07:03 +0000 (16:07 -0400)]
gallium/util: libunwind support
It's kinda sad that (a) we don't have debug_backtrace support on !X86
and that (b) we re-invent our own crude backtrace support in the first
place. If available, use libunwind instead. The backtrace format is
based on what xserver and weston use, since it is nice not to have to
figure out a different format.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Rob Clark [Fri, 24 Mar 2017 19:04:58 +0000 (15:04 -0400)]
gallium/util: clean up stack frame printing
Prep work for next patch.
Ideally 'struct debug_stack_frame' would be opaque, but it is embedded
in a bunch of places. But at least we can treat it opaquely.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Samuel Pitoiset [Fri, 31 Mar 2017 10:48:03 +0000 (12:48 +0200)]
st/mesa: add st_convert_image()
Should be used by the state tracker when glGetImageHandleARB()
is called in order to create a pipe_image_view template.
v3: - move the comment to *.c
v2: - make 'st' const
- describe the function
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Thu, 30 Mar 2017 16:55:02 +0000 (18:55 +0200)]
st/mesa: make 'st' const in st_mesa_format_to_pipe_format()
This avoids a compilation warning since st_convert_image()
requires 'st' to be const.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bartosz Tomczyk [Thu, 30 Mar 2017 20:31:09 +0000 (22:31 +0200)]
mesa/glthread: Call unmarshal_batch directly in glthread_finish
Call it directly when batch queue is empty. This avoids costly thread
synchronisation. This commit improves performance of games that have
previously regressed with mesa_glthread=true.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Fri, 31 Mar 2017 00:45:34 +0000 (11:45 +1100)]
mesa: disable glthread when DEBUG_OUTPUT_SYNCHRONOUS is enabled
We could re-enable it also but I haven't tested that yet, and I'm
not sure we care much anyway.
V2: don't disable it from within the call itself. We need a custom
marshalling function or we get stuck waiting for thread to
finish.
V3: tidy up redundant code copied from generated version.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Grazvydas Ignotas [Sun, 2 Apr 2017 17:22:21 +0000 (20:22 +0300)]
amd/addrlib: fix optimized build warnings
All the -Wunused-but-set-variable ones.
Found a way to do it with a oneliner.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Grazvydas Ignotas [Sun, 2 Apr 2017 17:22:11 +0000 (20:22 +0300)]
radeonsi: use unreachable to fix a warning
si_state.c: In function ‘si_make_texture_descriptor’:
si_state.c:3240:25: warning: ‘num_format’ may be used uninitialized
si_state.c:3240:12: warning: ‘data_format’ may be used uninitialized
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
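The pattern behind this kind of fix can be sketched as below. The macro is a hypothetical stand-in for an unreachable() helper built on the GCC/Clang builtin, and the enum and format codes are illustrative values, not real register encodings.

```c
/* Hypothetical stand-in for an unreachable() macro. */
#define UNREACHABLE(msg) __builtin_unreachable()

enum channel_type { CHANNEL_UNORM, CHANNEL_SNORM, CHANNEL_FLOAT };

static unsigned
pick_num_format(enum channel_type type)
{
   unsigned num_format;

   switch (type) {
   case CHANNEL_UNORM: num_format = 0; break;
   case CHANNEL_SNORM: num_format = 1; break;
   case CHANNEL_FLOAT: num_format = 7; break;
   default:
      /* Tells the compiler no other value can occur, so num_format is
       * initialised on every path that reaches the return. */
      UNREACHABLE("invalid channel type");
   }

   return num_format;
}
```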
Constantine Kharlamov [Sun, 26 Mar 2017 15:36:22 +0000 (18:36 +0300)]
r600g: Add more (un)likely functions
The 1st is obvious because of the assert, the 2nd is stolen from si_draw_vbo(),
and the 3rd is just a small refactoring.
Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
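For readers unfamiliar with these hints, a minimal sketch of (un)likely macros built on the GCC/Clang __builtin_expect builtin follows. The function is a hypothetical example and is not taken from r600g.

```c
/* Branch-prediction hints; !! normalises any truthy value to 1. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical draw entry point: returns 0 when there is nothing to
 * draw, otherwise the number of indices it would process. */
static int
draw_count_checked(const unsigned *indices, unsigned count)
{
   if (unlikely(indices == 0 || count == 0))
      return 0;   /* rare early-out path */

   return (int)count;   /* common path */
}
```

The hints only affect code layout and never change results, so they are safe to add wherever one branch is known to dominate.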
Constantine Kharlamov [Sun, 26 Mar 2017 15:36:21 +0000 (18:36 +0300)]
r600g: Remove intermediate assignment of pipe_draw_info
It removes the need to copy the whole struct on every call for no reason. Comparing
objdump -d output for the original and this patch, both compiled with -O2, shows
the function shrinking by 16 bytes.
Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Constantine Kharlamov [Sun, 26 Mar 2017 15:36:20 +0000 (18:36 +0300)]
r600g: Use separate index_bias variable
Needed to get rid of a separate struct allocation in the next patch, because
the one in the argument is constant and doesn't allow changing its fields.
Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
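The approach can be illustrated with a small sketch: instead of copying a const struct just to modify one field, the adjusted value lives in a local variable. The struct, the `indexed` parameter, and the selection rule are hypothetical and are not claimed to match r600g.

```c
/* Hypothetical subset of a draw-info struct. */
struct draw_info {
   unsigned start;
   unsigned count;
   int index_bias;
};

/* info is const, so the value actually used for the draw is kept in a
 * local variable rather than written back into a copy of the struct. */
static int
effective_index_bias(const struct draw_info *info, int indexed)
{
   int index_bias = indexed ? info->index_bias : (int)info->start;

   return index_bias;
}
```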
Ilia Mirkin [Sun, 2 Apr 2017 14:57:39 +0000 (10:57 -0400)]
nv30: fp/rast may be null when validating fb/scissor due to clear
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Sun, 2 Apr 2017 14:48:11 +0000 (10:48 -0400)]
nvc0: fragprog may not be set when e.g. clearing
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Mon, 6 Mar 2017 00:45:00 +0000 (19:45 -0500)]
nv50: don't assume a rast is set when validating for clears
Clears can happen before a rast is set, which can in turn cause scissors
and fragprog to be validated. Make sure that we handle this case.
Reported-by: Andrew Randrianasulu <randrianasulu@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Dave Airlie [Sun, 2 Apr 2017 04:36:51 +0000 (14:36 +1000)]
radv: fix order of the guardband register emission.
y is vert, x is horiz.
Noticed in visual inspection compared to radeonsi.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Edward O'Callaghan [Fri, 17 Mar 2017 05:24:06 +0000 (16:24 +1100)]
mesa/main: Fix memset in formatquery.c
v2: We explicitly set each member to -1 over using a confusing
memset().
Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
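The v2 note above prefers explicit assignment over memset(). A sketch of that style is shown below; the helper name and buffer length are hypothetical, and what exactly was wrong with the original memset() call is not stated in the message.

```c
#include <stddef.h>

#define NUM_PARAMS 16  /* hypothetical buffer length */

/* Fill every element with -1 explicitly. memset(buffer, -1, size) gives
 * the same bits on two's-complement targets, but only because each byte
 * becomes 0xFF, which is easy to misread or get wrong for other values. */
static void
fill_minus_one(int *buffer, size_t count)
{
   for (size_t i = 0; i < count; i++)
      buffer[i] = -1;
}
```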
Samuel Pitoiset [Thu, 30 Mar 2017 17:58:02 +0000 (19:58 +0200)]
radeonsi: add load_image_desc()
Similar to load_sampler_desc(). Same deal for bindless.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Thu, 30 Mar 2017 17:58:01 +0000 (19:58 +0200)]
radeonsi: rework the load_sampler_desc() helpers
Will be more convenient for bindless because the 64bit handle is
actually the base_ptr of the descriptor (ie. 'list' will be fetched
from TGSI_FILE_CONSTANT/TGSI_FILE_TEMPORARY instead).
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Thu, 30 Mar 2017 17:57:43 +0000 (19:57 +0200)]
gallivm: add lp_build_emit_fetch_src() helper
lp_build_emit_fetch() is useful when the source type can be
inferred from the instruction opcode.
However, for bindless samplers/images we can't do that easily
because tgsi_opcode_infer_src_type() returns TGSI_TYPE_FLOAT for
TEX instructions, while we need TGSI_TYPE_UNSIGNED64 if the
resource register is bindless.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Andres Gomez [Sat, 1 Apr 2017 15:51:40 +0000 (18:51 +0300)]
docs: add news item and link release notes for 17.0.3
Signed-off-by: Andres Gomez <agomez@igalia.com>
Andres Gomez [Sat, 1 Apr 2017 15:47:00 +0000 (18:47 +0300)]
docs: add sha256 checksums for 17.0.3
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 71d2f05a9e831af04ea26dd8c975d285e0b964ec)
Andres Gomez [Sat, 1 Apr 2017 14:29:34 +0000 (17:29 +0300)]
docs: add release notes for 17.0.3
Signed-off-by: Andres Gomez <agomez@igalia.com>
(cherry picked from commit 7f34ecae7fddd3435346f0475557b34920763422)
Erik Faye-Lund [Wed, 24 Sep 2014 11:41:25 +0000 (13:41 +0200)]
glsl: ir_explog_to_explog2 is no more
Since 63684a9a ("glsl: Combine many instruction lowering passes
into one.", Thu Nov 18 2010), we no longer have anything called
ir_explog_to_explog2. So it's only confusing to have those
references there.
Update with the appropriate method, so people can grep for it in
the current tree if they encounter it.
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Erik Faye-Lund [Wed, 21 Aug 2013 13:59:14 +0000 (15:59 +0200)]
gallium/docs: remove documentation of removed arg
geom was removed in e968975 ("gallium: remove the geom_flags param
from is_format_supported", Tue Mar 8 00:01:58 2011 +0100), but the
documentation of it was left over. Let's bring the documentation up
to date.
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Erik Faye-Lund [Mon, 8 Aug 2016 08:11:31 +0000 (10:11 +0200)]
st/mesa: avoid aliasing violation in st_cb_perfmon.c
Signed-off-by: Erik Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Michal Srb [Tue, 28 Mar 2017 20:39:28 +0000 (23:39 +0300)]
st: Add cubeMapFace parameter to st_finalize_texture.
st_finalize_texture always accesses the image at face 0, but that image may
not be set if we are working with a cubemap that only had another face set.
This fixes a crash in piglit
same-attachment-glFramebufferTexture2D-GL_DEPTH_STENCIL_ATTACHMENT.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Jason Ekstrand [Sat, 1 Apr 2017 05:16:24 +0000 (22:16 -0700)]
vulkan: Bump the header and XML to the latest public version
Karol Herbst [Sun, 26 Mar 2017 19:46:01 +0000 (21:46 +0200)]
nv50/ir: also do PostRaLoadPropagation for FMA
Helps Feral-ported games, due to their use of fma()
shader-db changes:
total instructions in shared programs : 3934925 -> 3934327 (-0.02%)
total gprs used in shared programs    : 481563 -> 481563 (0.00%)
total local used in shared programs   : 27469 -> 27469 (0.00%)
total bytes used in shared programs   : 36061888 -> 36056504 (-0.01%)
             local        gpr       inst      bytes
    helped       0          0        228        228
      hurt       0          0          0          0
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
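The percentages in the shader-db totals above follow from the before and after counts. A minimal sketch of that arithmetic in C (the function name is hypothetical and is not part of shader-db):

```c
/* Relative change from `before` to `after`, in percent.
 * Negative values mean the count went down. */
static double sketch_percent_change(long before, long after)
{
    return 100.0 * (double)(after - before) / (double)before;
}
```

For the instruction count, 3934925 -> 3934327 is a drop of 598 instructions, about -0.0152%, which matches the reported -0.02% when shown with two decimals. For the byte count, 36061888 -> 36056504 is about -0.0149%, matching the reported -0.01%.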
Karol Herbst [Sun, 26 Mar 2017 19:46:00 +0000 (21:46 +0200)]
gm107/ir: add LIMM form of mad
v2: renamed commit
    reordered modifiers
    add assert(dst == src2)
v3: reordered modifiers again
v5: no rounding bit for limms
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Karol Herbst [Sun, 26 Mar 2017 19:45:59 +0000 (21:45 +0200)]
gk110/ir: add LIMM form of mad
v2: renamed commit
    reordered modifiers
    add assert(dst == src2)
v3: removed wrong neg mod emission
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
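The assert(dst == src2) noted in the two LIMM commits above implies that the long-immediate form of mad needs the destination and the addend to share a register, which can only be checked once registers are assigned. A minimal C sketch of such a check follows. The struct layout, the field names, and the choice of src1 as the immediate operand are assumptions made for illustration; they are not taken from the nouveau code generator.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical post-RA view of a mad: dst = src0 * src1 + src2. */
struct sketch_mad {
    int dst_reg;        /* register assigned to the destination */
    int src0_reg;       /* register assigned to the first factor */
    int src2_reg;       /* register assigned to the addend */
    bool src1_is_imm;   /* assumed: src1 carries the 32-bit immediate */
    uint32_t src1_imm;  /* immediate bits, meaningful if src1_is_imm */
};

/* True when the instruction satisfies the constraint from the commit
 * notes: an immediate operand, and dst in the same register as src2. */
static bool sketch_mad_fits_limm(const struct sketch_mad *m)
{
    return m->src1_is_imm && m->dst_reg == m->src2_reg;
}
```

In SSA form dst and src2 are distinct values, so a test like this only becomes meaningful after register allocation, which is consistent with the folding being done in a post-RA pass.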
Karol Herbst [Sun, 26 Mar 2017 19:45:58 +0000 (21:45 +0200)]
nv50/ir: implement mad post ra folding for nvc0+
changes for GpuTest /test=pixmark_piano /benchmark /no_scorebox /msaa=0
/benchmark_duration_ms=60000 /width=1024 /height=640:
score: 1026 -> 1045
changes for shader-db:
total instructions in shared programs : 3943335 -> 3934925 (-0.21%)
total gprs used in shared programs    : 481563 -> 481563 (0.00%)
total local used in shared programs   : 27469 -> 27469 (0.00%)
total bytes used in shared programs   : 36139384 -> 36061888 (-0.21%)
             local        gpr       inst      bytes
    helped       0          0       3587       3587
      hurt       0          0          0          0
v2: removed TODO
    reordered to show changes without RA modification
    removed stale debugging print() call
v3: remove predicate checks
    enable only for gf100 ISA
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Karol Herbst [Sun, 26 Mar 2017 19:45:57 +0000 (21:45 +0200)]
nv50/ir: restructure and rename postraconstantfolding pass
We might want to add more folding passes here, so make it a bit more generic.
v2: leave the comment and reword commit message
v4: rename it to PostRaLoadPropagation
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Karol Herbst [Tue, 21 Mar 2017 17:37:47 +0000 (18:37 +0100)]
nvc0/ir: also do ConstantFolding for FMA
Helps mainly Feral-ported games, due to their use of fma()
shader-db changes:
total instructions in shared programs : 3941587 -> 3940749 (-0.02%)
total gprs used in shared programs    : 481511 -> 481460 (-0.01%)
total local used in shared programs   : 27469 -> 27481 (0.04%)
total bytes used in shared programs   : 36123344 -> 36115776 (-0.02%)
             local        gpr       inst      bytes
    helped       2         48        243        243
      hurt       2          3         32         32
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Karol Herbst [Tue, 21 Mar 2017 17:37:46 +0000 (18:37 +0100)]
nvc0/ir: disable support for LIMMs on MAD/FMA
I hit an assert in the emitter while toying around with optimizations, because
ConstantFolding immediated a big int into a mad.
There is special handling for FMA/MAD in insnCanLoad, which is broken. With
this patch the special path should no longer be hit. In any case, the constraints
for the LIMMs can't be guaranteed in SSA form, and I have patches pending to
use them via a post-SSA optimization pass.
As a result, immediates get immediated for int mad/fmas as well.
changes in shader-db:
total instructions in shared programs : 3943335 -> 3941587 (-0.04%)
total gprs used in shared programs    : 481563 -> 481511 (-0.01%)
total local used in shared programs   : 27469 -> 27469 (0.00%)
total bytes used in shared programs   : 36139384 -> 36123344 (-0.04%)
Signed-off-by: Karol Herbst <karolherbst@gmail.com>
[imirkin: remove extra bit from insnCanLoad as well]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Lyude [Wed, 15 Mar 2017 21:15:03 +0000 (17:15 -0400)]
nvc0: Add support for NV_fill_rectangle for the GM200+
This enables support for the GL_NV_fill_rectangle extension on the
GM200+ for Desktop OpenGL.
Signed-off-by: Lyude <lyude@redhat.com>
Changes since v1:
- Fix commit message
- Add note to reldocs
Changes since v2:
- Remove unnecessary parens in nvc0_screen_get_param()
- Fix sorting in release notes
- Don't execute FILL_RECTANGLE method on pre-GM200 GPUs
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>