Brian Paul [Tue, 28 Jun 2016 23:15:57 +0000 (17:15 -0600)]
svga: use SVGA3D_vgpu10_BufferCopy() for buffer copies
So that we do copies host-side rather than in the guest with map/memcpy.
Tested with piglit arb_copy_buffer-subdata-sync test and new
arb_copy_buffer-intra-buffer-copy test.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Thu, 23 Jun 2016 02:38:06 +0000 (20:38 -0600)]
svga: add SVGA3D_vgpu10_BufferCopy()
Acked-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Thu, 30 Jun 2016 19:27:57 +0000 (13:27 -0600)]
svga: flush buffers when mapping for reading
With host-side buffer copies (via SVGA3D_vgpu10_BufferCopy()) we have
to make sure any pending map-write operations are completed before reading
if the buffer is dirty. Otherwise the ReadbackSubResource operation could
get stale data from the host buffer.
This allows the piglit arb_copy_buffer-subdata-sync test to pass when
we start using the SVGA3D_vgpu10_BufferCopy command.
v2: check the sbuf->dirty flag in the outer conditional, per Charmaine.
Acked-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Neha Bhende [Thu, 23 Jun 2016 17:21:31 +0000 (11:21 -0600)]
svga: enable ARB_copy_image extension in the driver
Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Tue, 28 Jun 2016 23:13:57 +0000 (17:13 -0600)]
svga: try blitting with copy region in more cases
We previously could do blits with util_resource_copy_region() when doing
'loose' format checking. Also do blits with util_resource_copy_region()
when the blit src/dst formats (not the underlying resources) exactly
match. Needed for GL_ARB_copy_image.
Acked-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Thu, 23 Jun 2016 17:57:08 +0000 (11:57 -0600)]
svga: use copy_region_vgpu10() for region copies when possible
v2: remove extra svga_define_texture_level() call, per Charmaine.
Acked-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Neha Bhende [Tue, 28 Jun 2016 23:20:43 +0000 (17:20 -0600)]
svga: use vgpu10 CopyRegion command when possible
Do texture->texture copies host-side with this command when possible.
Use the previous software fallback otherwise.
Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Mon, 27 Jun 2016 17:18:10 +0000 (11:18 -0600)]
svga: set render target flag for snorm surfaces
We don't normally support rendering to SNORM surfaces, but with
GL_ARB_copy_image we can copy to them if we treat them as typeless
and use a UNORM surface view.
Acked-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Mon, 27 Jun 2016 17:17:45 +0000 (11:17 -0600)]
svga: add new svga_format_is_uncompressed_snorm() helper
Acked-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Mon, 27 Jun 2016 17:16:03 +0000 (11:16 -0600)]
svga: adjust sampler view format for RGBX
We previously handled the case of a RGBX sampler view of a RGBA surface.
Add the reverse case too. For GL_ARB_copy_image.
Acked-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Mon, 27 Jun 2016 17:15:07 +0000 (11:15 -0600)]
svga: adjust render target view format for RGBX
For GL_ARB_copy_image we may be asked to create an RGBA view of
a RGBX surface. Use an RGBX view format for that case.
Acked-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Neha Bhende [Thu, 2 Jun 2016 21:35:20 +0000 (14:35 -0700)]
svga: don't advertise support for R32G32B32_UINT/SINT surface formats
We want to be able to copy between different 32-bit, 3-channel surface
formats for GL_ARB_copy_image but since we don't support R32G32B32_FLOAT
for textures (it's not blendable and wouldn't work for render to texture)
we can't support 32-bit, 3-channel integer formats.
The state tracker will choose 4-channel formats instead.
Fixes the piglit arb_copy_image-format test for several cases.
Note: This change may need to be revisited if/when the texture_view exension
is enabled in driver.
Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Thu, 23 Jun 2016 02:42:37 +0000 (20:42 -0600)]
svga: use untyped surface formats in most cases
This allows us to do copies between different, but compatible, surface
formats such as RGBA8_UNORM, RGBA8_SINT, RGBA8_UINT, etc. for
GL_ARB_copy_image.
Acked-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Thu, 23 Jun 2016 14:11:25 +0000 (08:11 -0600)]
gallium/util: add tight_format_check param to util_can_blit_via_copy_region()
The VMware driver will use this for implementing GL_ARB_copy_image.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Wed, 8 Jun 2016 20:41:48 +0000 (14:41 -0600)]
gallium/util: simplify a few things in util_can_blit_via_copy_region()
Since only the src box can have negative dims for flipping, just
comparing the src/dst box sizes is enough to detect flips.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Wed, 8 Jun 2016 20:36:08 +0000 (14:36 -0600)]
gallium/util: new util_try_blit_via_copy_region() function
Pulled out of the util_try_blit_via_copy_region() function. Subsequent
changes build on this.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Neha Bhende [Tue, 28 Jun 2016 19:59:19 +0000 (12:59 -0700)]
svga: Fix failures caused in fedora 24
SVGA_3D_CMD_DX_GENRATE_MIPMAP & SVGA_3D_CMD_DX_SET_PREDICATION commands
are not presents in fedora 24 kernel module. Because of this
reason application like supertuxkart are not running.
v2: Add few comments and code modifications suggested by Brian P.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Fri, 20 May 2016 20:24:32 +0000 (14:24 -0600)]
st/wgl: remove unneeded inline qualifiers
No effect on size of the .o files (optimized build).
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Fri, 20 May 2016 20:24:59 +0000 (14:24 -0600)]
st/wgl: add a stw_device::initialized field
Set when the stw_dev object's initialization is completed. We test
for this in the window callback function to avoid potential crashes
on start-up in multi-threaded applications.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Brian Paul [Fri, 20 May 2016 20:16:18 +0000 (14:16 -0600)]
st/wgl: refactor framebuffer locking code
Split the old stw_framebuffer_reference() function into two new
functions: stw_framebuffer_reference_locked() which increments
the refcount and stw_framebuffer_release_locked() which decrements
the refcount and destroys the buffer when the count hits zero.
Original patch by Jose. Modified by Brian (clean-ups, lock assertion
checks, etc).
Reviewed-by: José Fonseca <jfonseca@vmware.com>
José Fonseca [Fri, 20 May 2016 18:11:56 +0000 (12:11 -0600)]
st/wgl: rename curctx to old_ctx in stw_make_current()
Reviewed-by: Brian Paul <brianp@vmware.com>
Brian Paul [Thu, 12 May 2016 22:33:30 +0000 (16:33 -0600)]
st/wgl: release the pbuffer DC at the end of wglBindTexImageARB()
Otherwise we were leaking DC GDI objects and if wglBindTexImageARB()
was called enough we'd eventually hit the GDI limit of 10,000 objects.
Things started failing at that point.
v2: also release DC if we return early, per Charmaine.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Matt Turner [Mon, 27 Jun 2016 23:31:09 +0000 (16:31 -0700)]
mesa: Close fp on error path.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Matt Turner [Thu, 26 May 2016 19:09:33 +0000 (12:09 -0700)]
i965: Simplify foreach_inst_in_block_safe() macro.
We know what the end looks like without examining .tail: it's NULL. It's
always NULL.
Andres Gomez [Wed, 29 Jun 2016 13:02:27 +0000 (16:02 +0300)]
Revert "i965: get PrimitiveMode from the program rather than the shader struct"
This reverts commit
644e015f0b9236e955d679cac4bcc7a1523fc475.
PrimitiveMode from the program doesn't always hold a valid value that
is neither of GL_TRIANGLES, GL_QUADS nor GL_ISOLINES when reaching
this code. This caused regressions in the following CTS tests:
GL44-CTS.stencil_texturing.functional
GL44-CTS.shading_language_420pack.binding_images
GL44-CTS.shading_language_420pack.binding_samplers
GL44-CTS.shading_language_420pack.binding_uniform_single_block
GL44-CTS.shading_language_420pack.implicit_conversions
GL44-CTS.shading_language_420pack.initializer_list
GL44-CTS.shading_language_420pack.length_of_vector_and_matrix
GL44-CTS.shading_language_420pack.line_continuation
Hence, we rather take it from the linked shader.
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Timothy Arceri [Thu, 30 Jun 2016 04:44:59 +0000 (14:44 +1000)]
glsl/mesa: move duplicate shader fields into new struct gl_shader_info
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Timothy Arceri [Mon, 27 Jun 2016 23:41:59 +0000 (09:41 +1000)]
glsl/main: remove unused params and make function static
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Timothy Arceri [Mon, 27 Jun 2016 21:52:46 +0000 (07:52 +1000)]
glsl: simplify link_uniform_blocks()
There is only ever one shader so simplify the input params.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Timothy Arceri [Thu, 30 Jun 2016 04:55:40 +0000 (14:55 +1000)]
glsl/mesa: split gl_shader in two
There are two distinctly different uses of this struct. The first
is to store GL shader objects. The second is to store information
about a shader stage thats been linked.
The two uses actually share few fields and there is clearly confusion
about their use. For example the linked shaders map one to one with
a program so can simply be destroyed along with the program. However
previously we were calling reference counting on the linked shaders.
We were also creating linked shaders with a name even though it
is always 0 and called the driver version of the _mesa_new_shader()
function unnecessarily for GL shader objects.
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Timothy Arceri [Thu, 30 Jun 2016 04:54:22 +0000 (14:54 +1000)]
mesa: don't print name in _mesa_append_uniforms_to_file()
This is only used to print linked shaders which always have a name of 0
so this was pointless.
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Timothy Arceri [Thu, 30 Jun 2016 04:52:21 +0000 (14:52 +1000)]
mesa: remove unreachable code from _mesa_write_shader_to_file()
_mesa_write_shader_to_file() is only used to print gl shader objects
so Program should never be set as it only gets set for linked shaders.
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Timothy Arceri [Mon, 27 Jun 2016 06:50:00 +0000 (16:50 +1000)]
glsl: pass symbols to find_matching_signature() rather than shader
This will allow us to later split gl_shader into two structs.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Timothy Arceri [Mon, 27 Jun 2016 06:25:00 +0000 (16:25 +1000)]
glsl: pass symbols rather than shader to _mesa_get_main_function_signature()
This will allow us to split gl_shader into two different structs, one for
shader objects and one for linked shaders.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Timothy Arceri [Mon, 27 Jun 2016 06:10:35 +0000 (16:10 +1000)]
mesa: don't use drivers NewShader function when creating shader objects
The drivers function only needs to be used when creating a struct for
linked shaders.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Timothy Arceri [Mon, 27 Jun 2016 05:38:51 +0000 (15:38 +1000)]
glsl: make cross_validate_globals() more generic
Rather than passing in gl_shader we now pass in the IR. This will
allow us to later split gl_shader into two structs. One for use
as a linked per stage shader struct and one for use as a GL shader
object.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Ian Romanick [Fri, 17 Jun 2016 02:51:15 +0000 (19:51 -0700)]
mapi: Export all GLES 3.1 functions in libGLESv2.so
Khronos recommends that the GLES 3.1 library also be called libGLESv2.
It also requires that functions be statically linkable from that
library.
NOTE: Mesa has supported the EGL_KHR_get_all_proc_addresses extension
since at least Mesa 10.5, so applications targeting Linux should use
eglGetProcAddress to avoid problems running binaries on systems with
older, non-GLES 3.1 libGLESv2 libraries.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Cc: Mike Gorchak <mike.gorchak.qnx@gmail.com>
Reported-by: Mike Gorchak <mike.gorchak.qnx@gmail.com>
Acked-by: Chad Versace <chad.versace@intel.com>
Chad Versace [Mon, 27 Jun 2016 18:33:36 +0000 (11:33 -0700)]
i965: Use drmIoctl for DRM_I915_GETPARAM (v2)
Stop using drmCommandWriteRead for such a simple ioctl.
v2: Handle errno correctly. [ickle]
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
sonjiang [Tue, 28 Jun 2016 15:23:41 +0000 (11:23 -0400)]
radeon/uvd: fix a h265 context size bug
Signed-off-by: sonjiang <sonny.jiang@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
sonjiang [Mon, 27 Jun 2016 21:19:01 +0000 (17:19 -0400)]
radeon/uvd: seperate uvd context buffer from DPB
Signed-off-by: sonjiang <sonny.jiang@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
sonjiang [Wed, 29 Jun 2016 15:24:36 +0000 (11:24 -0400)]
radeon uvd add uvd fw version for amdgpu
Signed-off-by: sonjiang <sonny.jiang@amd.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Samuel Pitoiset [Wed, 29 Jun 2016 13:35:44 +0000 (15:35 +0200)]
nv50/ir: print EMIT subops in debug mode
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Samuel Pitoiset [Wed, 29 Jun 2016 13:34:35 +0000 (15:34 +0200)]
nv50/ir: print RSQ/RCP subops in debug mode
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Samuel Pitoiset [Wed, 29 Jun 2016 13:25:16 +0000 (15:25 +0200)]
nv50/ir: print PIXLD subops in debug mode
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Samuel Pitoiset [Wed, 29 Jun 2016 13:16:35 +0000 (15:16 +0200)]
nv50/ir: print SHFL subops in debug mode
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Rodrigo Vivi [Thu, 23 Jun 2016 21:38:18 +0000 (14:38 -0700)]
i965: Removing PCI IDs that are no longer listed as Kabylake.
This is unusual. Usually IDs listed on early stages of platform
definition are kept there as reserved for later use.
However these IDs here are not listed anymore in any of steppings
and devices IDs tables for Kabylake on configurations overview
section of BSpec.
So it is better removing them before they become used in any
other future platform.
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Rodrigo Vivi [Thu, 23 Jun 2016 21:35:09 +0000 (14:35 -0700)]
i956: Add more Kabylake PCI IDs.
The spec has been updated adding new PCI IDs.
Reviewed-by: Dhinakaran Pandiyan <dhinakaran.pandiyan@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Marek Olšák [Mon, 27 Jun 2016 17:47:27 +0000 (19:47 +0200)]
gallium/radeon: remove zombie textures kept alive by DCC stat gathering
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 27 Jun 2016 17:46:39 +0000 (19:46 +0200)]
gallium/radeon: don't re-create queries for DCC stat gathering
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 27 Jun 2016 17:45:30 +0000 (19:45 +0200)]
gallium/radeon: assume X11 DRI3 can use at most 5 back buffers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 28 Jun 2016 19:02:40 +0000 (21:02 +0200)]
gallium/radeon: separate DCC starts as disabled (ps_draw_ratio = 0)
DRI3:
- Only slows clears can enable it for the first frame.
- A good PS/draw ratio can enable it for other frames.
DRI2:
- Only slows clears can enable it for a frame.
- Page-flipped color buffers are unref'd at the end of each frame,
so it can't be enabled in any other way.
- Relying on slow clears is sufficient for our synthetic benchmarks.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 27 Jun 2016 17:44:45 +0000 (19:44 +0200)]
gallium/radeon: R600_DEBUG=nodccfb disables separate DCC
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 21 Jun 2016 19:17:44 +0000 (21:17 +0200)]
gallium/radeon: add and use r600_texture_reference
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Vedran Miletić <vedran@miletic.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 21 Jun 2016 17:36:14 +0000 (19:36 +0200)]
gallium/radeon: add a HUD query for PS draw ratio stats from separate DCC
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 21 Jun 2016 16:18:46 +0000 (18:18 +0200)]
gallium/radeon: add a heuristic enabling DCC for scanout surfaces (v2)
DCC for displayable surfaces is allocated in a separate buffer and is
enabled or disabled based on PS invocations from 2 frames ago (to let
queries go idle) and the number of slow clears from the current frame.
At least an equivalent of 5 fullscreen draws or slow clears must be done
to enable DCC. (PS invocations / (width * height) + num_slow_clears >= 5)
Pipeline statistic queries are always active if a color buffer that can
have separate DCC is bound, even if separate DCC is disabled. That means
the window color buffer is always monitored and DCC is enabled only when
the situation is right.
The tracking of per-texture queries in r600_common_context is quite ugly,
but I don't see a better way.
The first fast clear always enables DCC. DCC decompression can disable it.
A later fast clear can enable it again. Enable/disable typically happens
only once per frame.
The impact is expected to be negligible because games usually don't have
a high level of overdraw. DCC usually activates when too much blending
is happening (smoke rendering) or when testing glClear performance and
CMASK isn't supported (Stoney).
v2: rename stuff, add assertions
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 21 Jun 2016 14:16:15 +0000 (16:16 +0200)]
gallium/radeon: add state setup for a separate DCC buffer
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 21 Jun 2016 14:09:33 +0000 (16:09 +0200)]
radeonsi: always calculate DCC info even if it's not used immediately
for a later use
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 21 Jun 2016 13:52:03 +0000 (15:52 +0200)]
radeonsi: unreference framebuffer state with set_framebuffer_state
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 21 Jun 2016 13:49:25 +0000 (15:49 +0200)]
gallium/radeon: add flag R600_QUERY_HW_FLAG_BEGIN_RESUMES
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Chad Versace [Mon, 27 Jun 2016 18:50:17 +0000 (11:50 -0700)]
i965: Use intel_get_param() more often
Replace some open-coded ioctls with intel_get_param().
This is just a cleanup. No change in behavior.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Chad Versace [Mon, 27 Jun 2016 18:29:27 +0000 (11:29 -0700)]
i965: Refactor intel_get_param()
Replace the function's __DRIscreen parameter with struct intel_screen.
The callsites feel more natural that way.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Marek Olšák [Wed, 29 Jun 2016 09:19:58 +0000 (11:19 +0200)]
radeonsi: don't advertise multisample shader images
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 28 Jun 2016 12:19:04 +0000 (14:19 +0200)]
radeonsi: enable distributed tess on multi-SE parts only
ported from Vulkan
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 28 Jun 2016 12:11:12 +0000 (14:11 +0200)]
radeonsi: set optimal VGT_HS_OFFCHIP_PARAM
ported from Vulkan
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 28 Jun 2016 11:15:45 +0000 (13:15 +0200)]
radeonsi: enable CU0 in each SE for LS-HS execution
Offchip-only tessellation allows this.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 28 Jun 2016 11:04:07 +0000 (13:04 +0200)]
radeonsi: use conformant line rasterization
AA lines are not completely correct (see TODO), but everything else
should be.
+ 3 linestipple piglits
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Rob Herring [Mon, 13 Jun 2016 18:45:53 +0000 (13:45 -0500)]
Android: add missing u_math.h include path for libmesa_isl
Commit
87d062a94080 ("i965: Fix shared local memory size for Gen9+.")
added u_math.h include which broke the Android build:
In file included from external/mesa3d/src/intel/isl/isl_storage_image.c:25:
In file included from external/mesa3d/src/mesa/drivers/dri/i965/brw_compiler.h:29:
external/mesa3d/src/mesa/main/macros.h:35:10: fatal error: 'util/u_math.h' file not found
^
Add the missing include paths for libmesa_isl.
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Kenneth Garunke <kenneth@whitecape.org>
Charmaine Lee [Thu, 23 Jun 2016 16:21:11 +0000 (09:21 -0700)]
svga: force direct map for transfering multiple slices
With commit
fb9fe35, we start using transfer_inline_write
for memcpy of TexSubImage. But SurfaceDMA command does not work
well with texture array. This patch forces direct map when
transfering multiple slices of a texture array.
Fixes piglit regression "texelFetch fs sampler1DArray"
Tested with MTT piglit, glretrace, conform.
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Brian Paul [Mon, 27 Jun 2016 17:40:00 +0000 (11:40 -0600)]
svga: whitespace, line wrapping fixes in svga_surface.c
Samuel Pitoiset [Mon, 27 Jun 2016 22:59:46 +0000 (00:59 +0200)]
gm107/ir: make sure that flagsDef is set when emitting setcond
Rely on the existence of a second destination when emitting a setcond
flag is dangerous, because this doesn't mean that the flag has been
correctly set. Instead rely on flagsDef like what emitX() does
for flagsSrc.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
Grazvydas Ignotas [Mon, 27 Jun 2016 22:33:21 +0000 (01:33 +0300)]
doc: improve INTEL_DEBUG documentation
Remove 'reg' option that does not actually exist, elaborate more about
'sync' and add the missing options.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Marek Olšák [Fri, 8 Apr 2016 10:15:50 +0000 (12:15 +0200)]
radeonsi: set PA_SU_SMALL_PRIM_FILTER_CNTL register on Polaris
This was missing.
Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Boyuan Zhang [Mon, 27 Jun 2016 18:37:16 +0000 (14:37 -0400)]
radeon/vce: use vce structure for vce 52 firmware
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Boyuan Zhang [Tue, 14 Jun 2016 17:34:29 +0000 (13:34 -0400)]
radeon/vce: add vce structures
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Leo Liu [Tue, 28 Jun 2016 00:40:30 +0000 (20:40 -0400)]
st/omx: fix decoder fillout for the OMX result buffer
The call for vl_video_buffer_adjust_size is with wrong order of
arguments, apparently it will have problem when interlaced false;
The size of OMX result buffer depends on real size of clips, vl buffer
dimension is aligned with 16, so 1080p(1920*1080) video will overflow
the OMX buffer
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Julien Isorce <j.isorce@samsung.com>
Hans de Goede [Sat, 7 May 2016 12:43:59 +0000 (14:43 +0200)]
pipe_loader_sw: Fix fd leak when instantiated via pipe_loader_sw_probe_kms
Make pipe_loader_sw_probe_kms take ownership of the passed in fd,
like pipe_loader_drm_probe_fd does.
The only caller is dri_kms_init_screen which passes in a dupped fd,
just like dri2_init_screen passes in a dupped fd to
pipe_loader_drm_probe_fd.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Jan Vesely [Thu, 23 Jun 2016 00:52:26 +0000 (20:52 -0400)]
clover: Fix kernel metadata retrieval after clang r273425
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Acked-by: Francisco Jerez <currojerez@riseup.net>
Francisco Jerez [Tue, 17 May 2016 14:02:32 +0000 (16:02 +0200)]
clover/llvm: Fix copyright attribution of invocation.cpp.
This file still only has my name on the copyright notice even though
most of the code (likely more than 90% of it) was authored by various
contributors -- It doesn't seem right to have the whole file
attributed to myself.
Acked-by: Michel Dänzer <michel.daenzer@amd.com>
Acked-by: Serge Martin <edb+mesa@sigluy.net>
Kenneth Graunke [Sun, 26 Jun 2016 07:39:32 +0000 (00:39 -0700)]
i965: Print EOT in fs_visitor::dump_instruction().
This was useful when debugging the previous commit's issue.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Kenneth Graunke [Sun, 26 Jun 2016 07:39:19 +0000 (00:39 -0700)]
i965: Make emit_urb_writes() not produce an EOT message for GS.
emit_urb_writes() contains code to emit an EOT write with no actual
data when there are no output varyings. This makes sense for the VS
and TES stages, where it's called once at the end of the program.
However, in the geometry shader stage, emit_urb_writes() is called once
for every EmitVertex(). We explicitly emit a URB write with EOT set at
the end of the shader, separately from this path. So we'd better not
terminate the thread. This could get us into trouble for shaders which
do EmitVertex() with no varyings followed by SSBO/image/atomic writes.
It also caused us to emit multiple sends with EOT set, which apparently
confuses the register allocator into not using g112-g127 for all but
the first one. This caused EU validation failures in OglGSCloth
shaders in shader-db. (The actual application was fine, but shader-db
thinks there are no outputs because it doesn't understand transform
feedback.)
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Kenneth Graunke [Fri, 24 Jun 2016 22:37:35 +0000 (15:37 -0700)]
glsl: Ignore ir_texture in lower_const_arrays_to_uniforms.
The only part of an ir_texture which can be an array is the
offsets array in textureGatherOffsets() calls. We don't want
to lower those, because they're required to remain constants.
Fixes textureGatherOffsets with Gallium drivers such as llvmpipe,
which commit
ef78df8d3b0cf540e5f08c8c2f6caa338b64a6c7 regressed.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Samuel Pitoiset [Mon, 27 Jun 2016 22:13:05 +0000 (00:13 +0200)]
gm107/ir: add missing setcond flags for LOP variants
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
Samuel Pitoiset [Mon, 27 Jun 2016 21:55:53 +0000 (23:55 +0200)]
gm107/ir: make use of LOP32I for all immediates
LOP only allows to emit 19-bits immediates.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
Dave Airlie [Mon, 27 Jun 2016 20:45:28 +0000 (06:45 +1000)]
virgl: reduce some limits for now
These need to be passed from the host in caps structure if they
are larger, this fixes a bunch of tests on Intel hw, that I'd
put the limits too high for.
Cc: "11.2 12.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Julien Isorce [Mon, 4 Jan 2016 22:17:59 +0000 (22:17 +0000)]
st/omx: count number of slices
Used by nouveau driver.
Similar patch was done for st/va:
851e7e12aa628d6781b5a3af2f2fc16ee73f435f
Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Julien Isorce [Tue, 8 Dec 2015 21:50:03 +0000 (21:50 +0000)]
st/omx: add support for nouveau / interlaced
Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Julien Isorce [Tue, 1 Dec 2015 08:10:42 +0000 (08:10 +0000)]
st/omx: retrieve preferred interlaced and buffer_formats
Interlaced can be true for nouveau driver.
Signed-off-by: Julien Isorce <j.isorce@samsung.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Marek Olšák [Fri, 8 Apr 2016 10:57:43 +0000 (12:57 +0200)]
radeonsi: use optimal WD settings for primitive restart on Polaris
ported from Vulkan
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Gurkirpal Singh [Sun, 26 Jun 2016 07:02:25 +0000 (12:32 +0530)]
st/va: Check NULL pointer
Call to handle_table_get in vlVaDestroySurfaces can
return NULL on failure.
CID:
1243522
Signed-off-by: Gurkirpal Singh <gurkirpal204@gmail.com>
Reviewed-by: Julien Isorce <j.isorce@samsung.com>
Eric Anholt [Sun, 26 Jun 2016 00:33:16 +0000 (17:33 -0700)]
nir: Fix copy_prop_src when src is an indirect access on a reg.
The intent was to continue down the indirect chain, not to call ourselves
with unchanged input arguments. Found by code inspection, and comparison
to copy_prop_alu_src().
We haven't hit this because callers of NIR's copy prop are doing so in
SSA, before indirect variable dereferences have been lowered to registers.
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Samuel Pitoiset [Sun, 26 Jun 2016 22:52:46 +0000 (00:52 +0200)]
gm107/ir: make use of MOV32I for all immediates
MOV only allows to emit 19-bits immediates. This is similar to the
previous fix I did for IMUL.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
Jordan Justen [Sun, 12 Jun 2016 01:16:47 +0000 (18:16 -0700)]
i965: Use miptree to decide format on multi-plane images for gen < 7
This wasn't handled correctly for multi-plane images on gen < 7 in
727a9b24933d384f5440ed4318fb720ed11d6dd1.
Reported-by: Mark Janes <mark.a.janes@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96674
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "12.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Ilia Mirkin [Sat, 25 Jun 2016 22:15:57 +0000 (18:15 -0400)]
nvc0: update "derived" state function names
derived_1/2/etc aren't too informative. Instead name them based on the
state they're derived from.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Sat, 25 Jun 2016 19:20:00 +0000 (15:20 -0400)]
nvc0: provide support for unscaled poly offset units
On at least Kepler hardware, the units differ based on RT format. Emit a
properly scaled value for Z16 depth buffers vs other formats, to help
out st/nine.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Samuel Pitoiset [Sun, 26 Jun 2016 16:42:22 +0000 (18:42 +0200)]
gm107/ir: make use of IMUL32I for all immediates
IMUL only allows to emit 19-bits immediates. This is similar to
d30768025a2283d4cc57930b784798bf278969da which fixed the same thing
for the GK110 emitter.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
Marek Olšák [Tue, 21 Jun 2016 19:46:16 +0000 (21:46 +0200)]
radeonsi: make si_is_format_supported static
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Vedran Miletić <vedran@miletic.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 21 Jun 2016 19:29:39 +0000 (21:29 +0200)]
radeonsi: boolean -> bool, TRUE -> true, FALSE -> false
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Vedran Miletić <vedran@miletic.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 21 Jun 2016 19:29:39 +0000 (21:29 +0200)]
gallium/radeon: boolean -> bool, TRUE -> true, FALSE -> false
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Vedran Miletić <vedran@miletic.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 21 Jun 2016 19:29:39 +0000 (21:29 +0200)]
gallium/radeon/winsyses: boolean -> bool, TRUE -> true, FALSE -> false
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Vedran Miletić <vedran@miletic.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Tue, 21 Jun 2016 19:13:00 +0000 (21:13 +0200)]
gallium/radeon: use r600_resource_reference
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Vedran Miletić <vedran@miletic.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Jason Ekstrand [Thu, 23 Jun 2016 21:22:03 +0000 (14:22 -0700)]
nir: Add a NIR_VALIDATE environment variable
It defaults to true so default behavior doesn't change but it allows you to
do NIR_VALIDATE=false if you don't want validation. Disabling validation
can substantially speed up shader compiles so you frequently want to turn
it off if compiler invariants aren't in question.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Rob Clark <robclark@freedesktop.org>