Dave Airlie [Mon, 30 Jan 2017 19:30:26 +0000 (05:30 +1000)]
radv: add sample mask input support
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 23 Feb 2017 04:24:20 +0000 (14:24 +1000)]
radv: enable location at sample when persample is forced.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Thu, 23 Feb 2017 04:24:20 +0000 (14:24 +1000)]
radv: fix interpolation at wrong place for offset interp
The code was interpolating at the offset from the sample,
not the offset from the center. Also fix for persample interpolation
modes we should force the pixel center to be at the sample.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
George Kyriazis [Fri, 3 Feb 2017 03:16:47 +0000 (21:16 -0600)]
swr: fix index buffers with non-zero indices
Fix issue with index buffers that do not contain a 0 index. 0 index
can be a non-valid index if the (copied) vertex buffers are a subset of the
user's (which happens because we only copy the range between min & max).
Core will use an index passed in from the driver to replace invalid indices.
Only do this for calls that contain non-zero indices, to minimize performance
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
cost.
George Kyriazis [Fri, 10 Feb 2017 16:24:32 +0000 (10:24 -0600)]
swr: add fetch shader cache
For now, the cache key is all of FETCH_COMPILE_STATE.
Use new/delete for swr_vertex_element_state, since we have to call the
constructors/destructors of the struct elements.
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Timothy Arceri [Thu, 23 Feb 2017 03:50:58 +0000 (14:50 +1100)]
st/mesa: free shader cache buffer on fallback
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Timothy Arceri [Thu, 23 Feb 2017 03:42:07 +0000 (14:42 +1100)]
st/mesa: fix crash in shader cache cased by race condition
If a thread doesn't load GLSL IR from cache but does load TGSI
from cache (that was created by another thread) than it will
crash due to expecting gl_program_parameter_list to have been
restored from the GLSL IR cache and not be null.
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Jason Ekstrand [Fri, 17 Feb 2017 22:14:48 +0000 (14:14 -0800)]
anv: Enable MSAA compression
This just enables basic MSAA compression (no fast clears) for all
multisampled surfaces. This improves the framerate of the Sascha
"multisampling" demo by 76% on my Sky Lake laptop. Running Talos on
medium settings with 8x MSAA, this improves the framerate in the
benchmark by 80%.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Wed, 22 Feb 2017 02:28:38 +0000 (18:28 -0800)]
anv/blorp/clear_subpass: Only set surface clear color for fast clears
Not all clear colors are valid. In particular, on Broadwell and
earlier, only 0/1 colors are allowed in surface state. No CTS tests are
affected outright by this because, apparently, the CTS coverage for
different clear colors is pretty terrible. However, when multisample
compression is enabled, we do hit it with CTS tests and this commit
prevents regressions when enabling MCS on Broadwell and earlier.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Pohjolainen, Topi [Thu, 23 Feb 2017 13:31:44 +0000 (15:31 +0200)]
intel/isl: Apply render target alignment constraints for MCS
v2: Instead of having the same block in isl_gen7,8,9.c add it
once into isl.c::isl_choose_image_alignment_el() instead.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Lionel Landwerlin [Mon, 20 Feb 2017 16:10:30 +0000 (16:10 +0000)]
intel/isl: add MCS width constraint 16 samples
v3 (Jason Ekstrand): Add a comment explaining why
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Jason Ekstrand [Fri, 17 Feb 2017 21:48:11 +0000 (13:48 -0800)]
intel/isl: Return surface creation success from aux helpers
The isl_surf_init call that each of these helpers make can, in theory,
fail. We should propagate that up to the caller rather than just
silently ignoring it.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Kenneth Graunke [Thu, 23 Feb 2017 01:16:01 +0000 (17:16 -0800)]
glsl: Raise a link error for non-SSO ES programs with a TES but no TCS.
OpenGL allows the TCS to be missing and supplies an implicit passthrough
shader, but OpenGL ES does not (see section 7.3 of the ES 3.2 spec,
cited above in the code).
One open question is how to handle this for ARB_ES3_2_compatibility.
This patch raises the link error for all ES shading language programs,
but it might make sense to base it on the API. The approach taken in
this patch is more restrictive, but should still allow any valid ES
programs to work in GL.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Samuel Iglesias Gonsálvez [Wed, 22 Feb 2017 11:27:15 +0000 (12:27 +0100)]
isl/state: fix assert on raw buffer surface state minimum size
From IVB PRM, SURFACE_STATE::Height:
"For typed buffer and structured buffer surfaces, the number of
entries in the buffer ranges from 1 to 2^27 . For raw buffer
surfaces, the number of entries in the buffer is the number of bytes
which can range from 1 to 2^30."
The minimum value is 1, according to the spec. The spec quote
was already added into the code by
028f6d8317f00.
Fixes crashing tests under:
dEQP-VK.robustness.buffer_access.*
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Iago Toral Quiroga [Wed, 22 Feb 2017 08:06:31 +0000 (09:06 +0100)]
glsl: enable early_fragment_tests implicitly with post_depth_coverage
From ARB_post_depth_coverage:
"This extension allows the fragment shader to control whether values in
gl_SampleMaskIn[] reflect the coverage after application of the early
depth and stencil tests. This feature can be enabled with the following
layout qualifier in the fragment shader:
layout(post_depth_coverage) in;
Use of this feature implicitly enables early fragment tests."
And a bit later it also adds:
"early_fragment_tests" requests that fragment tests be performed before
fragment shader execution, as described in section 15.2.4 "Early Fragment
Tests" of the OpenGL Specification. If neither this nor post_depth_coverage
are declared, per-fragment tests will be performed after fragment shader
execution."
Fixes:
GL45-CTS.post_depth_coverage_tests.PostDepthSampleMask
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Samuel Iglesias Gonsálvez [Thu, 9 Feb 2017 13:14:29 +0000 (14:14 +0100)]
glsl: refactor get_variable_being_redeclared() to return always an ir_variable pointer
It will return the current variable ('var') or the earlier declaration ('earlier') in
case of redeclaration of that variable.
In order to distinguish between both, 'is_redeclaration' boolean will indicate in which
case we are.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Samuel Iglesias Gonsálvez [Thu, 9 Feb 2017 12:54:46 +0000 (13:54 +0100)]
glsl: fix heap-use-after-free in ast_declarator_list::hir()
The get_variable_being_redeclared() function can free 'var' because
a re-declaration of an unsized array variable can establish the size, so
we set the array type to the 'earlier' declaration and free 'var' as it is
not needed anymore.
However, the same 'var' is referenced later in ast_declarator_list::hir().
This patch fixes it by picking the ir_variable_mode from the proper
ir_variable.
This error was detected by Address Sanitizer.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99677
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Charmaine Lee [Sat, 18 Feb 2017 09:26:52 +0000 (01:26 -0800)]
st/wgl: flush with ST_FLUSH_WAIT before releasing shared contexts
Before releasing a shared context, flush the context
with ST_FLUSH_WAIT to make sure all commands are executed.
This ensures that rendering to any shared resources is completed
before they will be referenced by another context.
Fixes an intermittent flickering with Photoshop. (VMware bug#
1779340)
Reviewed-by: Brian Paul <brianp@vmware.com>
Charmaine Lee [Sat, 18 Feb 2017 09:19:23 +0000 (01:19 -0800)]
st: add ST_FLUSH_WAIT to st_context_flush()
When st_context_flush() is called with ST_FLUSH_WAIT,
the function will return after the fence is completed.
Reviewed-by: Brian Paul <brianp@vmware.com>
Dave Airlie [Tue, 21 Feb 2017 04:13:20 +0000 (14:13 +1000)]
radv/ac: handle gs->copy shader clip distances.
This fixes up the clip distance passing between the geometry
shader and the copy shader. It packs the clip and cull distances
into one or two consecutive slots, and avoids wasting space and
make sure the gs output and copy shader input agree on where
things are stored.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 21 Feb 2017 04:09:11 +0000 (14:09 +1000)]
radv/ac: pass clips properly from vertex->geometry shader stages.
This works out the geometry shader clip/cull inputs separately
to the outputs, and uses that information to read from the ES->GS
ring buffer. It stores the clip/cull distances packed into one
or two slots. It fixes the es output emission and gs input
reading to match.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 21 Feb 2017 03:27:52 +0000 (13:27 +1000)]
radv/ac: rename num clips/cull to output clips/culls
As geom shaders can have different ones on entry and exit.
also move to uint8_t as these are never that big.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Dave Airlie [Tue, 21 Feb 2017 01:04:44 +0000 (11:04 +1000)]
vulkan/wsi: move image count to shared structure.
For prime support I need to access this, so move it in advance.
[airlied: fix int->uint32_t]
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Timothy Arceri [Thu, 23 Feb 2017 02:35:03 +0000 (13:35 +1100)]
radeon: fix r600 builds when old version of llvm is present
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Dylan Baker [Thu, 23 Feb 2017 00:37:10 +0000 (16:37 -0800)]
vulkan: Fix gen_enum_to_str in out of tree builds
In some configurations the util directory is created when building out
of tree, but not others. This patch ensures that it's created.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-and-Tested-by: Mike Lothian <mike@fireburn.co.uk>
Jason Ekstrand [Wed, 22 Feb 2017 03:37:53 +0000 (19:37 -0800)]
anv/Makefile: Gather all the genX files into one place
While we're here, we also fix the alphabetization of the list of
genx_* files.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Timothy Arceri [Mon, 20 Feb 2017 22:57:09 +0000 (09:57 +1100)]
r600/radeonsi: enable glsl/tgsi on-disk cache
For gpu generations that use LLVM we create a timestamp string
containing both the LLVM and Mesa build times, otherwise we just
use the Mesa build time.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Mon, 20 Feb 2017 22:00:49 +0000 (09:00 +1100)]
st/mesa: get on-disk shader cache
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Tue, 21 Feb 2017 22:18:09 +0000 (09:18 +1100)]
ddebug/rbug/trace: add get_disk_shader_cache() to pass-throughs
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Sun, 19 Feb 2017 22:29:01 +0000 (09:29 +1100)]
gallium: add get_disk_shader_cache() callback
V2: Provide more detail in callback description and add description to
screen.rst
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Wed, 1 Feb 2017 04:52:27 +0000 (15:52 +1100)]
st/mesa: implement a tgsi on-disk shader cache
Implements a tgsi cache for the OpenGL state tracker.
V2: add support for compute shaders
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Wed, 22 Feb 2017 00:33:44 +0000 (11:33 +1100)]
st/mesa: add sha1 field to st program structs
This will be used to share the sha1 computed by the tgsi load
function with the tgsi write function.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Tue, 21 Feb 2017 23:59:13 +0000 (10:59 +1100)]
st/mesa: move set_prog_affected_state_flags() to st_program.c
We want to use this in the new tgsi shader cache so we move it here
and make it available externally.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Wed, 22 Feb 2017 03:16:04 +0000 (14:16 +1100)]
util/disk_cache: fix bug with deleting old cache dirs
If there was more than a single directory in the .cache/mesa dir
then it would only remove one (or none) of the directories.
Apparently Valgrind was also reporting:
Conditional jump or move depends on uninitialised value
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Dylan Baker [Fri, 17 Feb 2017 19:57:24 +0000 (11:57 -0800)]
vulkan: Combine wsi and util makefiles
Reviewed-by: Matt Turner <mattst88@gmail.com>
Dylan Baker [Wed, 15 Feb 2017 23:41:50 +0000 (15:41 -0800)]
vulkan/util: Add generator for enum_to_str functions
This adds a python generator to produce enum_to_str functions for
Vulkan from the vk.xml API description. It supports extensions as well
as core API features, and the generator works with both python2 and
python3.
Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Thomas Hellstrom [Wed, 22 Feb 2017 20:42:01 +0000 (21:42 +0100)]
Revert "st/vdpau: Fix multithreading"
This reverts commit
f1e5dfbe3c8951a6c8acf41bf5e6c2d090098b2c.
For a detailed discussion see
https://lists.freedesktop.org/archives/mesa-dev/2017-February/145283.html
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Nayan Deshmukh [Wed, 22 Feb 2017 08:25:02 +0000 (13:55 +0530)]
vl: u_upload_alloc might fail to allocate buffer in bicubic filter
Signed-off-by: Nayan Deshmukh <nayan26deshmukh@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Marek Olšák [Mon, 20 Feb 2017 18:34:02 +0000 (19:34 +0100)]
gallium: reorder fields in pipe_draw_info
sizeof(struct pipe_draw_info) = 104 -> 88
Also, vertices_per_patch is switched to ubyte, because it can't be more
than 32.
Seemed-reasonable-to: Roland Scheidegger
Marek Olšák [Sun, 19 Feb 2017 18:29:06 +0000 (19:29 +0100)]
gallium/hud: handle a thread switch for API-thread-busy monitoring
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 19 Feb 2017 18:28:14 +0000 (19:28 +0100)]
gallium/hud: prevent an infinite loop
v2: use UINT64_MAX / 11
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 20 Feb 2017 17:42:41 +0000 (18:42 +0100)]
gallium/u_queue: isolate util_queue_fence implementation
it's cleaner this way.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Mon, 20 Feb 2017 14:27:07 +0000 (15:27 +0100)]
gallium/u_queue: fix random crashes when the app calls exit()
This fixes:
vdpauinfo: ../lib/CodeGen/TargetPassConfig.cpp:579: virtual void
llvm::TargetPassConfig::addMachinePasses(): Assertion `TPI && IPI &&
"Pass ID not registered!"' failed.
v2: use list_head, switch the call order in destroy
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Robert Bragg [Fri, 13 Mar 2015 22:10:47 +0000 (22:10 +0000)]
i965: Implement INTEL_performance_query backend
This adds a bare-bones backend for the INTEL_performance_query extension
that exposes pipeline statistics.
Although this could be considered redundant given that the same
statistics are already available via query objects, they are a simple
starting point for this extension and it's expected to be convenient for
tools wanting to have a single go to api to introspect what performance
counters are available, along with names, descriptions and semantic/data
types.
This code is derived from Kenneth Graunke's work, temporarily removed
while the frontend and backend interface were reworked.
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Robert Bragg [Wed, 29 Apr 2015 07:41:34 +0000 (08:41 +0100)]
mesa: Model INTEL perf query backend after query obj BE
Instead of using the same backend interface as AMD_performance_monitor
this defines a dedicated INTEL_performance_query interface that is
modelled more on the ARB_query_buffer_object interface (considering the
similarity of the extensions) with the addition of vfuncs for
initializing and enumerating query and counter info.
Compared to the previous backend, some notable differences are:
- The backend is free to represent counters using whatever data
structures are optimal/convenient since queries and counters are
enumerated via an iterator api instead of declaring them using
structures directly shared with the frontend.
This is also done to help us support the full range of data and
semantic types available with INTEL_performance_query which is awkward
while using a structure shared with the AMD_performance_monitor
backend since neither extension's types are a subset of the other.
- The backend must support waiting for a query instead of the frontend
simply using glFinish().
- Objects go through 'Active' and 'Ready' states consistent with the
query object backend (hopefully making them more familiar). There is
no 'Ended' state (which used to show that a query has ended at least
once for a given object). There is a new 'Used' state, set when a
query is first begun which implies that we are expecting to get
results back for the object at some point. There's no equivalent to
the 'EverBound' state since the spec doesn't require there to be a
limbo state between generating IDs and associating them with an object
on query Begin.
The INTEL_performance_query and AMD_performance_monitor extensions are
now completely orthogonal within Mesa main (though a driver could
optionally choose to implement both extensions within a unified backend
if that were convenient for the sake of sharing state/code).
v2: (Samuel Pitoiset)
- init PerfQuery.NumQueries in frontend
- s/return_string/output_clipped_string/
- s/backed/backend/ typo
- remove redundant *bytesWritten = 0
v3:
- Add InitPerfQueryInfo for lazy probing of available queries
v4:
- Clean up some internal usage of GL typedefs (Ken)
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Robert Bragg [Wed, 29 Apr 2015 07:17:20 +0000 (08:17 +0100)]
mesa: Separate INTEL_performance_query frontend
To allow the backend interfaces for AMD_performance_monitor and
INTEL_performance_query to evolve independently based on the more
specific requirements of each extension this starts by separating
the frontends of these extensions.
Even though there wasn't much tying these frontends together, this
separation intentionally copies what few helpers/utilities that were
shared between the two extensions, avoiding any re-factoring specific to
INTEL_performance_query so that the evolution will be easier to follow
later.
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Thomas Hellstrom [Fri, 23 Sep 2016 11:23:05 +0000 (13:23 +0200)]
gallium/vl: Simplify the matrix filter fragment shader
It looks like it was partly copied from the median filter fragment shader
and unnecessesarily saved a lot of temporary values.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Thomas Hellstrom [Fri, 1 Apr 2016 05:31:18 +0000 (07:31 +0200)]
st/vdpau: Fix multithreading
The vdpau state tracker allows multiple threads access to the same gallium
context simultaneously. We can fix this either by locking the same mutex
each time the context is used or by using a different gallium context for
each mutex domain. Here we do the latter, although I'm not sure that's really
the best option.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Acked-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Thomas Hellstrom [Fri, 4 Mar 2016 11:11:23 +0000 (12:11 +0100)]
gallium/vl: Parameter substitution in the csc matrix computation
Makes the code significantly more readable.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Thomas Hellstrom [Fri, 4 Mar 2016 10:49:31 +0000 (11:49 +0100)]
gallium/vl: Simplify usage of full range matrices
When looking at the full range matrices, it becomes obvious that the difference
between the standard matrices and the full range matrices is that the full
range matrices are multiplied by 1.164. Together with offsetting the y value
with -16/255, this will scale and offset RGB with the desired quantities.
However, the standard SMPTE 240M matrix seems to differ a bit since the
U and V coefficients are only multiplied with 1.138 to get the full range
matrix. This would actually alter the color somewhat so I figure that's an
error. The full range matrix is consistent with Nvidia's VDPAU implementation.
We can also incorporate the ybias in the brightness simplifying the
calculation somewhat.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Thomas Hellstrom [Tue, 8 Mar 2016 06:30:52 +0000 (07:30 +0100)]
gallium/vl Fix brightness matrix description
The brightness matrix doesn't actually match the procamp matrix and
what's calculated in vl_csc_get_matrix.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Thomas Hellstrom [Mon, 29 Feb 2016 12:33:39 +0000 (13:33 +0100)]
gallium/vl: Don't map vertex buffers on creation
It will cause multiple simultaneous maps of the same vertex buffer and
flushed-while-mapped warnings.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Thomas Hellstrom [Fri, 23 Sep 2016 10:57:54 +0000 (12:57 +0200)]
gallium/vl: Add sampler views to video filter fragment shaders
Needed for at least the svga driver.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Thomas Hellstrom [Thu, 18 Aug 2016 11:41:42 +0000 (13:41 +0200)]
gallium/vl: declare sampler views in compositor shaders
The svga driver relies on the existence of these sampler views.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Brian Paul [Tue, 21 Feb 2017 22:52:40 +0000 (15:52 -0700)]
util: fix MSVC build issue in disk_cache.h
Windows doesn't have dlfcn.h. Protect the code in question
with #if ENABLE_SHADER_CACHE test. And fix indentation.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Dave Airlie [Wed, 22 Feb 2017 02:20:18 +0000 (02:20 +0000)]
radv: fix typo in the subpass barrier patch.
Fixes: dbb0eaccc radv: handle subpass cache flushes
Signed-off-by: Dave Airlie <airlied@redhat.com>
Rafael Antognolli [Fri, 20 Jan 2017 17:53:27 +0000 (09:53 -0800)]
i965/gen6+: Enable arb_transform_feedback_overflow_query.
This extension adds new query types which can be used to detect overflow
of transform feedback buffers. The new query types are also accepted by
conditional rendering commands.
v3:
- s/gen7+/gen6+/ in the relnotes (Jordan Justen)
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Rafael Antognolli [Fri, 20 Jan 2017 17:53:26 +0000 (09:53 -0800)]
i965: Add support for xfb overflow query on conditional render.
Enable the use of a transform feedback overflow query with
glBeginConditionalRender. The render commands will only execute if the
query is true (i.e. if there was an overflow).
Use ARB_conditional_render_inverted to change this behavior.
v4:
- reuse MI_MATH calcs from hsw_queryob (Kenneth)
- fallback to software conditional rendering when MI_MATH is not
available (Kenneth)
v5:
- check query->Target (Kenneth)
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Rafael Antognolli [Fri, 20 Jan 2017 17:53:25 +0000 (09:53 -0800)]
i965: Add support for xfb overflow on query buffer objects.
Enable getting the results of a transform feedback overflow query with a
buffer object.
v4:
- hsw_overflow_result_to_gpr0 a public function, so it can be used
by conditional render. (Kenneth)
- fix typo grp0/gpr0 (Kenneth)
- rename load_gen_written_data_to_regs to
load_overflow_data_to_cs_gprs (Kenneth)
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Rafael Antognolli [Fri, 20 Jan 2017 17:53:24 +0000 (09:53 -0800)]
i965: add plumbing for ARB_transform_feedback_overflow_query.
When querying for transform feedback overflow on one or all of the
streams, store information about number of generated and written
primitives. Then check whether generated == written.
v2:
- use only SO_PRIM_STORAGE_NEEDED, do not fallback to
CL_INVOCATION_COUNT. (Kenneth)
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Rafael Antognolli [Fri, 20 Jan 2017 17:53:23 +0000 (09:53 -0800)]
mesa: Track transform feedback overflow query objects.
Also update checks on conditional rendering.
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Rafael Antognolli [Fri, 20 Jan 2017 17:53:22 +0000 (09:53 -0800)]
mesa: Add types for ARB_transform_feedback_oveflow_query.
Add some basic types and storage for the queries of this extension.
v2:
- update date of extension (Kenneth)
Signed-off-by: Rafael Antognolli <rafael.antognolli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Eric Engestrom [Tue, 21 Feb 2017 14:15:39 +0000 (14:15 +0000)]
gallium/docs: use imgmath instead of pngmath
WARNING: sphinx.ext.pngmath has been deprecated. Please use
sphinx.ext.imgmath instead.
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Eric Engestrom [Tue, 21 Feb 2017 14:15:38 +0000 (14:15 +0000)]
gallium/docs: fix section title formatting
src/gallium/docs/source/tgsi.rst:3488: WARNING: Title underline too short.
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Eric Engestrom [Tue, 21 Feb 2017 14:15:37 +0000 (14:15 +0000)]
gallium/docs: add missing newlines
Without these, mathjax considers these as the continuation of the
previous line.
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Eric Engestrom [Tue, 21 Feb 2017 14:15:36 +0000 (14:15 +0000)]
gallium/docs: add missing math formatting
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Eric Engestrom [Tue, 21 Feb 2017 14:15:35 +0000 (14:15 +0000)]
gallium/docs: fix sublist formatting
src/gallium/docs/source/context.rst:95: ERROR: Unexpected indentation.
Sub lists need to be surrounded by a blank line.
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Timothy Arceri [Tue, 21 Feb 2017 05:34:49 +0000 (16:34 +1100)]
util/disk_cache: create timestamp and gpu_id dirs when MESA_GLSL_CACHE_DIR is used
The make check test is also updated to make sure these dirs are created.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Timothy Arceri [Fri, 10 Feb 2017 02:02:22 +0000 (13:02 +1100)]
util/radv: move *_get_function_timestamp() to utils
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Kenneth Graunke [Fri, 17 Feb 2017 18:18:35 +0000 (10:18 -0800)]
docs: Update features.txt and relnotes for GL_ARB_transform_feedback2
Kenneth Graunke [Fri, 17 Feb 2017 04:05:39 +0000 (20:05 -0800)]
i965: Enable ARB_transform_feedback2 on Sandybridge.
The only feature over and above ES 3.0 is DrawTransformFeedback().
We already have to do the whole SOL_NUM_PRIMS_WRITTEN counter dance in
order to compute the SVBI value for ResumeTransformFeedback(), at which
point our existing GetTransformFeedbackVertexCount() implementation will
do the trick (though with a stall to CPU map the buffer).
Someday, we could probably implement DrawTransformFeedback() more
efficiently, using the "Load Internal Vertex Count" feature of
3DSTATE_SVB_INDEX and the 3DPRIMITIVE indirect vertex count bit.
Rumor has it this allows people to use WebGL 2.0 on Sandybridge.
Note that we don't need pipelined register writes like Gen7+ because
we use the 3DSTATE_SVB_INDEX command rather than MI_LOAD_REGISTER_MEM.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99842
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Kenneth Graunke [Fri, 17 Feb 2017 05:24:20 +0000 (21:24 -0800)]
i965: Properly reset SVBI counters on ResumeTransformFeedback().
This fixes Piglit's ARB_transform_feedback2/change-objects-while-paused
GLES 3.0 test. When resuming the transform feedback object, we need to
reset the SVBI counters so we continue writing at the correct point in
the buffer.
Instead of SO_WRITE_OFFSET counters (with a DWord offset), we have the
Streamed Vertex Buffer Index (SVBI) counters, which contain a count of
vertices emitted.
Unfortunately, there's no straightforward way to store the current SVBI
counter values to a buffer. They're not available in a register. You
can use a bit in the 3DSTATE_SVB_INDEX packet to copy them to another
internal counter which 3DPRIMITIVE can use...but there's no good way to
extract that either.
So, once again, we use SO_NUM_PRIMS_WRITTEN to calculate the vertex
numbers. Thankfully, we can reuse most of the existing Gen7+ code.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Kenneth Graunke [Fri, 17 Feb 2017 05:31:57 +0000 (21:31 -0800)]
i965: Save max_index in brw_transform_feedback_object.
I'm going to need this in a new Resume hook shortly.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Kenneth Graunke [Fri, 17 Feb 2017 05:17:44 +0000 (21:17 -0800)]
i965: Update brw_save_primitives_written_counters for pre-Gen7.
Sandybridge and earlier only have a single counter.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Kenneth Graunke [Fri, 17 Feb 2017 05:15:07 +0000 (21:15 -0800)]
i965: Use ctx->Const.MaxVertexStreams rather than BRW_XFB_MAX_STREAMS.
This way on Sandybridge we'll only do 1 stream worth of math, since
we only have one SO_NUM_PRIMS_WRITTEN counter.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Kenneth Graunke [Fri, 17 Feb 2017 05:06:34 +0000 (21:06 -0800)]
i965: Move some code from gen7_sol_state.c to gen6_sol.c.
I plan to use these functions on Sandybridge soon. I changed the prefix
on a couple of functions to "brw" instead of "gen7" as in theory they
should be usable all the way back to G45.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Kenneth Graunke [Fri, 17 Feb 2017 05:22:51 +0000 (21:22 -0800)]
i965: Drop dead Gen8+ code from Gen7/sometimes-HSW driver hooks.
These driver hooks are not used when MI_MATH and MI_LOAD_REGISTER_REG
are supported, which Gen8+ can always do. So this code is dead.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Marek Olšák [Mon, 20 Feb 2017 18:24:18 +0000 (19:24 +0100)]
vbo: kill primitive restart lowering in glDrawArrays
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Marek Olšák [Sat, 18 Feb 2017 16:08:34 +0000 (17:08 +0100)]
radeonsi: fix issues with monolithic shaders
R600_DEBUG=mono has had no effect since:
commit
1fabb297177069e95ec1bb7053acb32f8ec3e092
Author: Marek Olšák <marek.olsak@amd.com>
Date: Tue Feb 14 22:08:32 2017 +0100
radeonsi: have separate LS and ES main shader parts in the shader selector
Also, this assertion was failing:
si_state_shaders.c:1307: si_shader_select_with_key: Assertion
`!shader->is_optimized' failed.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Thu, 19 Jan 2017 13:36:17 +0000 (14:36 +0100)]
radeonsi: set no-signed-zeros-fp-math
Recommended by Matt Arsenault.
46757 shaders in 28742 tests
Totals:
SGPRS:
2068851 ->
2066907 (-0.09 %)
VGPRS:
1604056 ->
1602676 (-0.09 %)
Spilled SGPRs: 1402 -> 1382 (-1.43 %)
Spilled VGPRs: 113 -> 113 (0.00 %)
Private memory VGPRs: 1332 -> 1332 (0.00 %)
Scratch size: 3224 -> 3188 (-1.12 %) dwords per thread
Code Size:
58815520 ->
58716788 (-0.17 %) bytes
LDS: 1162 -> 1162 (0.00 %) blocks
Max Waves: 354616 -> 354905 (0.08 %)
Wait states: 0 -> 0 (0.00 %)
Totals from affected shaders:
SGPRS: 786452 -> 784508 (-0.25 %)
VGPRS: 530000 -> 528620 (-0.26 %)
Spilled SGPRs: 958 -> 938 (-2.09 %)
Spilled VGPRs: 85 -> 85 (0.00 %)
Private memory VGPRs: 636 -> 636 (0.00 %)
Scratch size: 1880 -> 1844 (-1.91 %) dwords per thread
Code Size:
26349936 ->
26251204 (-0.37 %) bytes
LDS: 304 -> 304 (0.00 %) blocks
Max Waves: 108962 -> 109251 (0.27 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sun, 29 Jan 2017 21:45:36 +0000 (22:45 +0100)]
gallivm: add no-signed-zeros-fp-math option to lp_create_builder (v2)
v2: define lp_float_mode
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sat, 18 Feb 2017 15:55:50 +0000 (16:55 +0100)]
radeonsi: skip TESSINNER/OUTER offchip stores if TES doesn't read them
We were unconditionally storing these outputs, sometimes even one component
at a time, but apps never read them in TES.
Move the TESSINNER/OUTER buffer stores into the TCS epilog where we can
easily disable them on demand.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sat, 18 Feb 2017 14:30:25 +0000 (15:30 +0100)]
radeonsi: skip LDS stores in TCS if there are no LDS output reads
This removes a lot of useless LDS stores.
A few games read TESSINNER/OUTER, but not any other outputs. Most games
don't read any outputs.
The only app doing LDS output reads is UE4 Lightsroom Interior.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Marek Olšák [Sat, 18 Feb 2017 14:26:42 +0000 (15:26 +0100)]
tgsi/scan: add basic info about tessellation OUT and IN uses
not all of them will be used immediately
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Jason Ekstrand [Mon, 20 Feb 2017 19:04:12 +0000 (11:04 -0800)]
anv: Take a device parameter in anv_state_flush
This allows the helper to check for llc instead of having to do it
manually at all the call sites.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 20 Feb 2017 18:37:34 +0000 (10:37 -0800)]
anv: Pull all clflushing into a clflush_range helper
All this cache line address calculation stuff is tricky. Let's not
duplicate it more places than we have to.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 20 Feb 2017 18:30:20 +0000 (10:30 -0800)]
anv: Remove the unused state_pool_emit macro
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 20 Feb 2017 18:28:27 +0000 (10:28 -0800)]
anv: Rename clflush_range and state_clflush
It's a bit shorter and easier to work with. Also, we're about to add a
helper called clflush which does the clflush but without any memory
fencing.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Mon, 20 Feb 2017 19:03:04 +0000 (11:03 -0800)]
intel/blorp: Explicitly flush all allocated state
Found by inspection. However, I expect it fixes real bugs when using
blorp from Vulkan on little-core platforms.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Jason Ekstrand [Sat, 18 Feb 2017 23:58:40 +0000 (15:58 -0800)]
anv: Put everything about queries in genX_query.c
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Sat, 18 Feb 2017 23:51:11 +0000 (15:51 -0800)]
anv/Makefile: alphabetize
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Jason Ekstrand [Sat, 18 Feb 2017 23:21:04 +0000 (15:21 -0800)]
anv/query: Perform CmdResetQueryPool on the GPU
This fixes a some rendering corruption in The Talos Principle
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Jason Ekstrand [Sat, 18 Feb 2017 23:18:31 +0000 (15:18 -0800)]
genxml: Make MI_STORE_DATA_IMM more consistent
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Jason Ekstrand [Sat, 18 Feb 2017 21:25:04 +0000 (13:25 -0800)]
anv/query: clflush the bo map on non-LLC platforms
Found by inspection
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Jason Ekstrand [Mon, 20 Feb 2017 18:18:57 +0000 (10:18 -0800)]
anv: Add an invalidate_range helper
This is similar to clflush_range except that it puts the mfence on the
other side to ensure caches are flushed prior to reading.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Christian Gmeiner [Wed, 8 Feb 2017 12:14:05 +0000 (13:14 +0100)]
etnaviv: remove number of pixel pipes validation
This validation was added before the etnaviv drm driver landed in
the linux kernel. Due some pre-merge API changes we had to fix-up
this value but with a mainline kernel this is not a problem anymore.
Lets remove that validation which also gets rid of problem caught
by Coverity, reported to me by imirkin.
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Christian Gmeiner [Wed, 8 Feb 2017 12:07:25 +0000 (13:07 +0100)]
etnaviv: move pctx initialisation to avoid a null dereference
In case ctx->stream == NULL the fail label gets executed where
pctx gets dereferenced - too bad pctx is NULL in that case.
Caught by Coverity, reported to me by imirkin.
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Christian Gmeiner [Wed, 8 Feb 2017 12:03:19 +0000 (13:03 +0100)]
etnaviv: add missing fallthrough annotation
Caught by Coverity, reported to me by imirkin.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Emil Velikov [Mon, 20 Feb 2017 19:27:49 +0000 (19:27 +0000)]
docs/releasing.html: reword "distro breaking changes" hunk
v2: s/rare/rarely/ (Eric)
Suggested-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Nayan Deshmukh <nayan26deshmukh@gmail.com> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Emil Velikov [Sun, 19 Feb 2017 11:49:21 +0000 (11:49 +0000)]
radv: make radv_resolve_entrypoint static
Used only within the generated source file.
Fixes: 12301c54186 ("radv: drop the RADV_CALL macro.")
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>