Kenneth Graunke [Thu, 23 Mar 2017 22:29:43 +0000 (15:29 -0700)]
i965/drm: Drop libpciaccess dependencies.
i965 doesn't use drm_intel_get_aperture_sizes(), so we can delete
support for it. This avoids a build dependency on libpciaccess.
Chris also notes:
"There's a really old bug that hopefully has been closed already
(although as far as I can tell, it has never been fixed) about
how using libpciaccess from libdrm_intel breaks the world (since
libpciaccess uses a singleton that is torn down at the first request
rather than upon the last user)."
This bug should go away in two commits when we switch over to our
internal copy of libdrm_intel.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84325
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Tue, 21 Mar 2017 22:55:29 +0000 (15:55 -0700)]
i965/drm: Make libdrm_lists.h compile by defining typeof.
typeof doesn't seem to exist, so this won't compile (but we don't yet
try). Define it to __typeof__. This code is going to die soon anyway.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
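The typeof workaround described above can be sketched roughly as follows. This is only an illustration, assuming a GCC- or Clang-style compiler in a strict ISO C mode where the bare typeof keyword is unavailable but __typeof__ still is. The struct names and the ITEM_FROM_LINK macro are hypothetical and are not taken from libdrm_lists.h.

```c
#include <assert.h>
#include <stddef.h>

/* Map the bare spelling onto the always-available one (GCC/Clang assumption). */
#ifndef typeof
#define typeof __typeof__
#endif

/* Hypothetical intrusive list types, for illustration only. */
struct list_node { struct list_node *next; };
struct item { int value; struct list_node link; };

/* container_of-style macro that needs typeof to name the container type. */
#define ITEM_FROM_LINK(ptr, sample, member) \
   ((typeof(sample))((char *)(ptr) - offsetof(typeof(*(sample)), member)))

static int item_value_from_link(struct list_node *node)
{
   struct item *it = NULL;
   it = ITEM_FROM_LINK(node, it, link);
   return it->value;
}
```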
Kenneth Graunke [Fri, 31 Mar 2017 17:06:24 +0000 (10:06 -0700)]
i965/drm: remove legacy defines, aub functions, and decoder prototypes
We never imported any of this code, so drop the prototypes, unused
enums, and defines.
Based on patches by Emil Velikov.
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Mon, 20 Mar 2017 23:40:01 +0000 (16:40 -0700)]
i965: Import libdrm_intel.
This imports commit 19c4cfc54918d361f2535aec16650e9f0be667cd of
libdrm/intel/*.[ch], minus a few files that we're never going to use
(and would immediately delete), plus a few necessary dependencies.
We rename intel_bufmgr.h to brw_bufmgr.h to avoid #include conflicts.
We also fix UTF-8 symbol problems in intel_bufmgr_gem.c comments
because vim keeps trying to fix that every time I edit the file,
and we may as well fix it right away.
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Mon, 3 Apr 2017 07:03:01 +0000 (00:03 -0700)]
i965: Make sure we don't use CPU maps for the scanout buffer.
Using an incoherent CPU map on the active scanout buffer is really
sketchy - we may need extra flushing via GEM_SW_FINISH, or using
drmModeDirtyFB() and kernel commit a6a7cc4b7db6d (4.10+).
Chris suggests "never ever do that", which seems like a wise plan!
intel_miptree_map_raw() uses CPU maps on linear buffers.
Having a linear scanout buffer should be really rare, and mapping the
front buffer should be similarly rare. Together, it should basically
never happen. But, in case it does somehow...make sure that mapping
the scanout buffer always goes through an uncached GTT map.
v2: Add a giant comment written by Chris Wilson.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Kenneth Graunke [Wed, 22 Mar 2017 00:08:20 +0000 (17:08 -0700)]
i965: Stop calling drm_intel_bufmgr_gem_enable_fenced_relocs().
This does nothing on Gen4+, which is the only hardware we support.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Kenneth Graunke [Thu, 30 Mar 2017 20:52:46 +0000 (13:52 -0700)]
i965: Fix GLX_MESA_query_renderer video memory on 32-bit.
On modern systems with 4GB apertures, the size in bytes is 4294967296,
or (1ull << 32). The kernel gives us the aperture size as a __u64,
which works out great.
Unfortunately, libdrm "helpfully" returns the data as a size_t, which
on 32-bit systems means it truncates the aperture size to 0 bytes.
We've happily reported this value as 0 MB of video memory via
GLX_MESA_query_renderer since it was originally exposed.
This patch bypasses libdrm and calls the ioctl ourselves so we can
use a proper uint64_t, avoiding the 32-bit integer overflow. We now
report a proper video memory size on 32-bit systems.
Chris points out that the aperture size (CPU mappable size limit)
isn't really the right thing to be checking. But libdrm_intel uses
it to fail execbuffer, so it is an actual limit for now. Once that's
fixed we can probably move to something else. In the meantime, fix
the obvious typecasting bug.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
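The truncation described above can be reproduced with plain integer arithmetic. This is a minimal sketch: the function names are made up for illustration, and uint32_t stands in for size_t on a 32-bit system.

```c
#include <assert.h>
#include <stdint.h>

/* Megabytes reported when the byte count is first squeezed through a
 * 32-bit type, as a 32-bit size_t would do. */
static unsigned video_memory_mb_truncated(uint64_t aperture_bytes)
{
   uint32_t truncated = (uint32_t)aperture_bytes;   /* keeps the low 32 bits */
   return (unsigned)(truncated / (1024 * 1024));
}

/* Megabytes reported when the value stays 64-bit all the way through. */
static unsigned video_memory_mb(uint64_t aperture_bytes)
{
   return (unsigned)(aperture_bytes / (1024 * 1024));
}
```

With a 4GB aperture the first function yields 0 MB and the second yields 4096 MB.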
Samuel Pitoiset [Mon, 10 Apr 2017 09:49:05 +0000 (11:49 +0200)]
gallium/radeon: add HUD queries for GPU temperature and clocks
Only the Radeon kernel driver exposed the GPU temperature and
the shader/memory clocks. This implements the same functionality
for the AMDGPU kernel driver.
These queries will return 0 if the DRM version is less than 3.10.
I don't explicitly check the version here because the query
codepath is already a bit messy.
v2: - rebase on top of master
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Mon, 10 Apr 2017 09:49:04 +0000 (11:49 +0200)]
configure.ac: require libdrm_amdgpu 2.4.79
The sensor info requires amdgpu_query_sensor_info().
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Wed, 5 Apr 2017 22:07:35 +0000 (00:07 +0200)]
radeonsi: add new si_check_render_feedback_texture() helper
For bindless.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Wed, 5 Apr 2017 22:07:34 +0000 (00:07 +0200)]
radeonsi: add new si_decompress_color_texture() helper
For bindless.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Wed, 5 Apr 2017 22:07:33 +0000 (00:07 +0200)]
radeonsi: add new depth_needs_decompression() helper
v2: - rename to depth_needs_decompression() instead
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Wed, 5 Apr 2017 22:07:32 +0000 (00:07 +0200)]
radeonsi: add a 'break' in si_check_render_feedback_*()
No need to check all color buffers.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Wed, 5 Apr 2017 22:07:31 +0000 (00:07 +0200)]
radeonsi: re-use 'desc' in si_set_shader_image()
No need to compute the offset in the descriptor twice.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Samuel Pitoiset [Fri, 7 Apr 2017 16:44:16 +0000 (18:44 +0200)]
ac: add unreachable() in ac_build_image_opcode()
To silence the following compiler warning:
common/ac_llvm_build.c: In function ‘ac_build_image_opcode’:
common/ac_llvm_build.c:1080:3: warning: ‘name’ may be used uninitialized in this function [-Wmaybe-uninitialized]
snprintf(intr_name, sizeof(intr_name), "%s%s%s%s.v4f32.%s.v8i32",
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
name,
~~~~~
a->compare ? ".c" : "",
~~~~~~~~~~~~~~~~~~~~~~~
a->bias ? ".b" :
~~~~~~~~~~~~~~~~
a->lod ? ".l" :
~~~~~~~~~~~~~~~
a->deriv ? ".d" :
~~~~~~~~~~~~~~~~~
a->level_zero ? ".lz" : "",
~~~~~~~~~~~~~~~~~~~~~~~~~~~
a->offset ? ".o" : "",
~~~~~~~~~~~~~~~~~~~~~~
type);
~~~~~
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
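The pattern behind this fix can be sketched as below. This is an assumption-laden illustration: the enum, the function name, and the UNREACHABLE macro are hypothetical, and the macro simply wraps the GCC/Clang builtin, so Mesa's own unreachable() may be defined differently.

```c
#include <assert.h>
#include <string.h>

/* Assumption: GCC/Clang builtin; not Mesa's actual macro definition. */
#define UNREACHABLE(msg) __builtin_unreachable()

/* Hypothetical opcode list, for illustration only. */
enum image_opcode { OP_SAMPLE, OP_LOAD, OP_GATHER4 };

static const char *image_opcode_name(enum image_opcode op)
{
   const char *name;

   switch (op) {
   case OP_SAMPLE:  name = "sample";  break;
   case OP_LOAD:    name = "load";    break;
   case OP_GATHER4: name = "gather4"; break;
   default:
      /* Tells the compiler no path leaves the switch with name unset. */
      UNREACHABLE("invalid image opcode");
   }
   return name;
}
```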
Constantine Kharlamov [Mon, 10 Apr 2017 20:04:37 +0000 (23:04 +0300)]
r600g: get rid of dummy pixel shader
The idea is taken from radeonsi. The code was mostly already checking for a
null pixel shader, so few checks had to be added.
Interestingly, according to testing with GTAⅣ, binding of a null shader happens
a lot at the start (then just stops), but draw_vbo() never actually sees a null
ps.
v2: added a check I missed because of a macro using a prefix to choose
a shader.
Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Constantine Kharlamov [Mon, 10 Apr 2017 20:04:36 +0000 (23:04 +0300)]
r600g: add draw_vbo check for a NULL pixel shader
Taken from radeonsi, required to remove dummy pixel shader in the next patch
Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Constantine Kharlamov [Mon, 10 Apr 2017 20:04:35 +0000 (23:04 +0300)]
r600g: skip repeating vs, gs, and tes shader binds
The idea is taken from radeonsi. The code lacks some checks for null vs,
and I'm unsure about some changes against that, so I left it in place.
Some statistics for GTAⅣ:
Average tessellation bind skip per frame: ≈350
Average geometry shader bind skip per frame: ≈260
Skips of vertex shader binds occur rarely enough to not show up in the per-frame
counter at all, so I'm just going to say: it happens.
v2: I've occasionally removed an empty line, don't do this.
v3: return a check for null tes and gs back, while I haven't figured out
the way to move stride assignment to r600_update_derived_state() (as it
is in radeonsi).
Signed-off-by: Constantine Kharlamov <Hi-Angel@yandex.ru>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
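The bind-skipping idea can be sketched as an early return when the same shader is bound again. The struct and function below are hypothetical stand-ins, not the r600g code. The counter merely represents whatever state update a real bind would trigger.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context, for illustration only. */
struct shader_ctx {
   const void *bound_gs;       /* currently bound geometry shader, or NULL */
   unsigned    state_updates;  /* stands in for marking driver state dirty */
};

static void bind_gs(struct shader_ctx *ctx, const void *gs)
{
   if (ctx->bound_gs == gs)
      return;                  /* same shader again: nothing to do */

   ctx->bound_gs = gs;
   ctx->state_updates++;
}
```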
Bartosz Tomczyk [Mon, 10 Apr 2017 18:31:00 +0000 (12:31 -0600)]
mesa: use single memcpy when strides match in glReadPixels, texstore code
v2: fix indentation
Reviewed-by: Brian Paul <brianp@vmware.com>
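A rough sketch of the single-memcpy fast path follows. The function and parameter names are invented for illustration and are not the Mesa code. The fast path here also copies any padding bytes between rows, which assumes the destination padding carries no meaning.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy `rows` rows of `row_bytes` bytes between two images whose rows are
 * `dst_stride` and `src_stride` bytes apart (hypothetical helper). */
static void copy_rows(unsigned char *dst, size_t dst_stride,
                      const unsigned char *src, size_t src_stride,
                      size_t row_bytes, size_t rows)
{
   if (rows == 0)
      return;

   if (dst_stride == src_stride) {
      /* Same layout on both sides: one memcpy covers every row.  The last
       * row is only copied up to row_bytes so we never read or write past
       * the end of either image. */
      memcpy(dst, src, (rows - 1) * src_stride + row_bytes);
      return;
   }

   /* Different layouts: fall back to one memcpy per row. */
   for (size_t y = 0; y < rows; y++)
      memcpy(dst + y * dst_stride, src + y * src_stride, row_bytes);
}
```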
Jason Ekstrand [Thu, 30 Mar 2017 06:00:16 +0000 (23:00 -0700)]
intel/blorp: Use ISL for emitting depth/stencil/hiz
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Jason Ekstrand [Wed, 5 Apr 2017 23:59:06 +0000 (16:59 -0700)]
intel/blorp: Emit 3DSTATE_STENCIL_BUFFER before HIER_DEPTH
We're about to replace blorp's emit code with ISL and it emits them in
the other order. This makes diffing the aubs easier.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Jason Ekstrand [Thu, 30 Mar 2017 05:24:50 +0000 (22:24 -0700)]
anv: Use ISL for emitting depth/stencil/hiz
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Jason Ekstrand [Thu, 30 Mar 2017 04:10:50 +0000 (21:10 -0700)]
intel/isl: Add support for emitting depth/stencil/hiz
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Thomas Hindoe Paaboel Andersen [Sat, 8 Apr 2017 06:36:36 +0000 (08:36 +0200)]
amd/addrlib: use correct variable name in header
Since the inclusion in 7f160efcde41b52ad78e562316384373dab419e3,
the header used x_biased, while the implementation used y_biased.
This changes the header to match the implementation since the
uses of the function seem to expect y_biased.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Timothy Arceri [Sat, 8 Apr 2017 00:54:56 +0000 (10:54 +1000)]
mesa/st: take ownership rather than adding reference for new renderbuffers
This avoids locking in the reference calls and fixes a leak after the
RefCount initialisation was changed from 0 to 1.
Fixes: 32141e53d1520 (mesa: tidy up renderbuffer RefCount initialisation)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Bartosz Tomczyk <bartosz.tomczyk86@gmail.com>
Timothy Arceri [Sat, 8 Apr 2017 00:47:12 +0000 (10:47 +1000)]
x11: take ownership rather than adding reference for new renderbuffers
This avoids locking in the reference calls and fixes a leak after the
RefCount initialisation was changed from 0 to 1.
Fixes: 32141e53d1520 (mesa: tidy up renderbuffer RefCount initialisation)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Timothy Arceri [Sat, 8 Apr 2017 00:42:57 +0000 (10:42 +1000)]
osmesa: tidy up renderbuffer refCount initialisation
32141e53d1520 changed _mesa_init_renderbuffer() to set it to 1 for
us.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Timothy Arceri [Sat, 8 Apr 2017 00:35:57 +0000 (10:35 +1000)]
swrast: take ownership rather than adding reference for new renderbuffers
This avoids locking in the reference calls and fixes a leak after the
RefCount initialisation was changed from 0 to 1.
Fixes: 32141e53d1520 (mesa: tidy up renderbuffer RefCount initialisation)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Timothy Arceri [Sat, 8 Apr 2017 00:29:22 +0000 (10:29 +1000)]
radeon: take ownership rather than adding reference for new renderbuffers
This avoids locking in the reference calls and fixes a leak after the
RefCount initialisation was changed from 0 to 1.
Fixes: 32141e53d1520 (mesa: tidy up renderbuffer RefCount initialisation)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Timothy Arceri [Sat, 8 Apr 2017 00:26:34 +0000 (10:26 +1000)]
nouveau: take ownership rather than adding reference for new renderbuffers
This avoids locking in the reference calls and fixes a leak after the
RefCount initialisation was changed from 0 to 1.
Fixes: 32141e53d1520 (mesa: tidy up renderbuffer RefCount initialisation)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Timothy Arceri [Sat, 8 Apr 2017 00:22:16 +0000 (10:22 +1000)]
i965: take ownership rather than adding reference for new renderbuffers
This avoids locking in the reference calls and fixes a leak after the
RefCount initialisation was changed from 0 to 1.
Fixes: 32141e53d1520 (mesa: tidy up renderbuffer RefCount initialisation)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Timothy Arceri [Sat, 8 Apr 2017 00:13:24 +0000 (10:13 +1000)]
i915: take ownership rather than adding reference for new renderbuffers
This avoids locking in the reference calls and fixes a leak after the
RefCount initialisation was changed from 0 to 1.
Fixes: 32141e53d1520 (mesa: tidy up renderbuffer RefCount initialisation)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Timothy Arceri [Sat, 8 Apr 2017 00:03:20 +0000 (10:03 +1000)]
mesa: create _mesa_attach_renderbuffer_without_ref() helper
This will be used to take ownership of freshly created renderbuffers,
avoiding the need to call the reference function which requires
locking.
V2: dereference any existing fb attachments and actually attach the
new rb.
v3: split out validation and attachment type/complete setting into
a shared static function.
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Tested-by: Bartosz Tomczyk <bartosz.tomczyk86@gmail.com>
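The difference between adding a reference and taking ownership can be sketched with a toy reference count. All names below are hypothetical and greatly simplified. The real helper also dereferences any existing attachment and performs validation, and the real reference function takes a lock, none of which is modelled here.

```c
#include <assert.h>
#include <stddef.h>

/* Toy types, for illustration only. */
struct rb { int ref_count; };
struct attachment { struct rb *rb; };

/* A new renderbuffer starts with one reference held by its creator. */
static void rb_init(struct rb *rb) { rb->ref_count = 1; }

/* Old pattern: the attachment takes an additional reference. */
static void attach_with_ref(struct attachment *att, struct rb *rb)
{
   rb->ref_count++;
   att->rb = rb;
}

/* New pattern: the attachment adopts the creator's reference. */
static void attach_without_ref(struct attachment *att, struct rb *rb)
{
   att->rb = rb;
}

static int refs_after_attach(int take_ownership)
{
   struct rb rb;
   struct attachment att = { NULL };

   rb_init(&rb);
   if (take_ownership)
      attach_without_ref(&att, &rb);
   else
      attach_with_ref(&att, &rb);
   return att.rb->ref_count;
}
```

With an initial count of 1, the old pattern leaves the count at 2, so releasing the attachment alone never drops it to zero. That is the leak the series fixes.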
Ilia Mirkin [Sun, 9 Apr 2017 18:56:59 +0000 (14:56 -0400)]
nv50/ir: remove unused swizzle field in ValueRef
The nv50 ir is scalar. Perhaps this was from some early attempts to
integrate the simd aspects of nv30. However at this point it's entirely
unused.
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Boyan Ding [Tue, 4 Apr 2017 14:44:47 +0000 (22:44 +0800)]
nouveau: enable ARB_shader_clock on nv50 and nvc0
v2: Also enable support on nv50
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Boyan Ding [Tue, 4 Apr 2017 14:44:46 +0000 (22:44 +0800)]
nv50/ir: Handle TGSI_OPCODE_CLOCK
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
[imirkin: make zero mov non-fixed]
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Boyan Ding [Fri, 31 Mar 2017 02:33:05 +0000 (10:33 +0800)]
gm107/ir: Emit SV_CLOCK system value
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ben Widawsky [Wed, 5 Apr 2017 17:31:45 +0000 (10:31 -0700)]
gbm: Assert modifiers and count are copacetic
The API/entry point in mesa already checks the correct behavior,
however, it's possible to be handled by another implementation and those
implementations should not be able to abuse a weird combination of count
and pointer.
This fixes CID 1403193
Cc: Mark Janes <mark.a.janes@intel.com>
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
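One way to read "copacetic" here is that the pointer and the count must agree: either both are empty or both are populated. The exact condition used in the commit is not quoted above, so the check below is an assumption, and the function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed rule: a non-NULL modifier list goes with a non-zero count,
 * and a NULL list goes with a zero count. */
static bool modifiers_consistent(const uint64_t *modifiers, unsigned count)
{
   return (modifiers != NULL) == (count > 0);
}
```

An implementation would typically wrap such a predicate in assert() at the top of its buffer-creation entry point.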
Gustaw Smolarczyk [Thu, 30 Mar 2017 18:09:33 +0000 (20:09 +0200)]
st/mesa: Use compressed fog mode for atifs.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Gustaw Smolarczyk [Thu, 30 Mar 2017 18:09:32 +0000 (20:09 +0200)]
mesa/main/ff_frag: Use compressed TexEnv Combine state.
Along the way, add missing GL_ONE source support and drop non-existing
GL_ZERO and GL_ONE operand support.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Gustaw Smolarczyk [Thu, 30 Mar 2017 18:09:31 +0000 (20:09 +0200)]
mesa/main/ff_frag: Use compressed fog mode.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Gustaw Smolarczyk [Thu, 30 Mar 2017 18:09:30 +0000 (20:09 +0200)]
mesa/main: Maintain compressed TexEnv Combine state.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Gustaw Smolarczyk [Thu, 30 Mar 2017 18:09:29 +0000 (20:09 +0200)]
mesa/main: Maintain compressed fog mode.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Gustaw Smolarczyk [Thu, 30 Mar 2017 18:09:28 +0000 (20:09 +0200)]
mesa/main/ff_frag: Don't retrieve format if not necessary.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Gustaw Smolarczyk [Thu, 30 Mar 2017 18:09:27 +0000 (20:09 +0200)]
mesa/main/ff_frag: Use gl_texture_object::TargetIndex.
Instead of computing it once again using _mesa_tex_target_to_index.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Gustaw Smolarczyk [Thu, 30 Mar 2017 18:09:26 +0000 (20:09 +0200)]
mesa/main/ff_frag: Store nr_enabled_units only once.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Gustaw Smolarczyk [Thu, 30 Mar 2017 18:09:25 +0000 (20:09 +0200)]
mesa/main/ff_frag: Simplify get_fp_input_mask.
Change it into filter_fp_input_mask transform function that instead of
returning a mask, transforms input.
Also, simplify the case of vertex program handling by assuming that
fp_inputs is always a combination of VARYING_BIT_COL* and VARYING_BIT_TEX*.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Gustaw Smolarczyk [Thu, 30 Mar 2017 18:09:24 +0000 (20:09 +0200)]
mesa/main/ff_frag: Don't bother with VARYING_BIT_FOGC.
It's not used.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Gustaw Smolarczyk [Thu, 30 Mar 2017 18:09:23 +0000 (20:09 +0200)]
mesa/main/ff_frag: Remove unused struct.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Gustaw Smolarczyk [Thu, 30 Mar 2017 18:09:22 +0000 (20:09 +0200)]
mesa/main/ff_frag: Reduce the size of nr_enabled_units.
Since it holds values from 0 to 8, 4 bits will suffice.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
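The bit-width reasoning can be checked with a small bitfield sketch. The struct names are hypothetical and are not the actual state key. The 3-bit variant is included only to show why one fewer bit would not do.

```c
#include <assert.h>

/* Hypothetical key with a 4-bit counter: holds 0..15, enough for 0..8. */
struct key4 { unsigned nr_enabled_units:4; };

/* Hypothetical key with a 3-bit counter: holds 0..7 only. */
struct key3 { unsigned nr_enabled_units:3; };

static unsigned store_in_4_bits(unsigned n)
{
   struct key4 key = { 0 };
   key.nr_enabled_units = n;      /* value is reduced modulo 16 */
   return key.nr_enabled_units;
}

static unsigned store_in_3_bits(unsigned n)
{
   struct key3 key = { 0 };
   key.nr_enabled_units = n;      /* value is reduced modulo 8 */
   return key.nr_enabled_units;
}
```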
Gustaw Smolarczyk [Thu, 30 Mar 2017 18:09:21 +0000 (20:09 +0200)]
mesa/main/ff_frag: Remove enabled_units.
Its only usage is easily replaced by nr_enabled_units. As for cache key
part, unit[i].enabled should be enough.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Gustaw Smolarczyk [Thu, 30 Mar 2017 18:09:20 +0000 (20:09 +0200)]
mesa/main/ff_frag: Use correct constant.
Since fixed-function shaders are restricted to MAX_TEXTURE_COORD_UNITS
texture units, use this constant instead of MAX_TEXTURE_UNITS. This
reduces the array size from 32 to 8.
Signed-off-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
Jason Ekstrand [Thu, 30 Mar 2017 04:01:48 +0000 (21:01 -0700)]
intel/isl: Use genx_bits.h instead of a hand-rolled table
This gets rid of one piece of ugliness with the way ISL handles
emitting surface states. I've never liked that hand-rolled table but it
was the best we had at the time.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Jason Ekstrand [Thu, 30 Mar 2017 03:39:18 +0000 (20:39 -0700)]
intel/genxml/bits: Emit per-container _length helpers
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Jason Ekstrand [Thu, 30 Mar 2017 03:40:49 +0000 (20:40 -0700)]
intel/genxml/bits: Emit per-field _start helpers
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Jason Ekstrand [Thu, 30 Mar 2017 03:21:06 +0000 (20:21 -0700)]
intel/genxml/bits: Pull the function emit code into a helper block
The helper block is extremely general. It takes a string property name
and an object that supports three methods: has_prop, iter_prop, and
get_prop. This way we can easily generalize it to emit more different
types of getter functions.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Jason Ekstrand [Thu, 30 Mar 2017 01:39:13 +0000 (18:39 -0700)]
intel/genxml/bits: Refactor to add a container class
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Ilia Mirkin [Sat, 8 Apr 2017 03:23:25 +0000 (23:23 -0400)]
nvc0/ir: fix overwriting of offset register with interpolateAtOffset
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Jason Ekstrand [Sat, 11 Mar 2017 01:50:01 +0000 (17:50 -0800)]
anv: Use subpass dependencies for flushes
Instead of figuring it all out ourselves, just use the information given
to us by the client.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Sat, 11 Mar 2017 01:29:24 +0000 (17:29 -0800)]
anv/pass: Record required pipe flushes
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Fri, 7 Apr 2017 05:18:03 +0000 (22:18 -0700)]
anv/pass: Use anv_multialloc for allocating the anv_pass
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Fri, 7 Apr 2017 05:43:40 +0000 (22:43 -0700)]
anv/descriptor_set: Use anv_multialloc for descriptor set layouts
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Jason Ekstrand [Fri, 7 Apr 2017 05:15:16 +0000 (22:15 -0700)]
anv: Add a helper for doing mass allocations
We tend to try to reduce the number of allocation calls the Vulkan
driver uses by doing a single allocation whenever possible for a data
structure. While this has certain downsides (usually code complexity),
it does mean error handling and cleanup is much easier. This commit
adds a nice little helper struct for getting rid of some of that
complexity.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
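The single-allocation idea can be sketched as a running size counter that hands out aligned offsets. This is not the anv_multialloc API. The struct and function names are hypothetical, and alignments are assumed to be powers of two no larger than what malloc() guarantees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: tracks the total size of one combined allocation. */
struct multialloc {
   size_t size;
};

/* Round v up to a multiple of a; a must be a power of two. */
static size_t align_up(size_t v, size_t a)
{
   return (v + a - 1) & ~(a - 1);
}

/* Reserve `size` bytes at the given alignment and return the offset of the
 * reserved region from the start of the combined allocation. */
static size_t multialloc_add(struct multialloc *ma, size_t size, size_t align)
{
   size_t offset = align_up(ma->size, align);
   ma->size = offset + size;
   return offset;
}

/* One malloc() for everything; the caller adds the recorded offsets to the
 * returned base pointer and later releases it all with a single free(). */
static void *multialloc_alloc(const struct multialloc *ma)
{
   return malloc(ma->size ? ma->size : 1);
}
```

Because there is only one allocation, there is only one failure point to handle and only one pointer to free on cleanup.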
Jason Ekstrand [Sat, 11 Mar 2017 00:51:07 +0000 (16:51 -0800)]
anv: Add helpers for converting access flags to pipe bits
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Timothy Arceri [Thu, 6 Apr 2017 04:47:34 +0000 (14:47 +1000)]
mesa: simplify and optimise vertex bindings tracking
We only need to update it if something changes. Also
_mesa_bind_vertex_buffer() will update the mask when binding to a
NULL or default buffer so no need to do that update here.
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Timothy Arceri [Fri, 7 Apr 2017 01:24:37 +0000 (11:24 +1000)]
glsl: fix lower jumps for nested non-void returns
Fixes the case where a loop contains a return and the loop is
nested inside an if.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
https://bugs.freedesktop.org/show_bug.cgi?id=100303
Ilia Mirkin [Thu, 9 Feb 2017 23:04:08 +0000 (18:04 -0500)]
gallium: fix some math formulas to display better
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Ilia Mirkin [Sat, 8 Apr 2017 00:17:47 +0000 (20:17 -0400)]
nvc0/ir: fix LSB/BFE/BFI implementations
Overwriting the src register is a very bad idea - it logically maps onto
the TGSI registers, and so is effectively overwriting the source values.
Reported-by: Boyan Ding <boyan.j.ding@gmail.com>
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Nicolai Hähnle [Fri, 7 Apr 2017 10:23:11 +0000 (12:23 +0200)]
util: fix swizzle of INSTANCEID system value
radeonsi added stricter checking for correct swizzles in debug builds.
Reported-by: Michel Dänzer <michel.daenzer@amd.com>
Fixes: 4cf29427770f ("radeonsi: support 64-bit system values")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bruce Cherniak [Fri, 7 Apr 2017 16:27:39 +0000 (11:27 -0500)]
st/glx: Add awareness for multisample pixel formats to st/glx-xlib.
In preparation for enabling MSAA in OpenSWR, the state trackers need to
be aware of multisample pixel formats for software renderers. This patch
allows glx-xlib to query the renderer for support of pixel
formats with multisample, and create multisample resources.
This change is benign to softpipe and llvmpipe, as is_format_supported
returns FALSE for any sample_count > 1. OpenSWR does the same at the
moment, but that will change soon.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tim Rowley [Fri, 7 Apr 2017 00:09:23 +0000 (19:09 -0500)]
swr: fix unused variable warnings
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
Brian Paul [Wed, 5 Apr 2017 20:07:57 +0000 (14:07 -0600)]
glx: silence uninitialized var warning
Signed-off-by: Brian Paul <brianp@vmware.com>
Brian Paul [Wed, 5 Apr 2017 20:07:45 +0000 (14:07 -0600)]
st/mesa: silence unused/uninitialized var warnings
Signed-off-by: Brian Paul <brianp@vmware.com>
Brian Paul [Wed, 5 Apr 2017 19:53:41 +0000 (13:53 -0600)]
gallivm: init vars to silence gcc warnings
Silence warnings about using possibly uninitialized values.
Signed-off-by: Brian Paul <brianp@vmware.com>
Charmaine Lee [Fri, 27 Jan 2017 02:46:23 +0000 (18:46 -0800)]
svga: add context pointer to the invalidate surface interface
With this patch, we will specify the current context
when we invalidate the surface before the surface is
put back to the recycled surface pool. This allows the
winsys layer to use the specified context to do the
invalidation rather than using the last context that
referenced the surface. This prevents a race condition if
the last referenced context is now made current in another thread.
Tested with MTT glretrace, NobelClinicianViewer.
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Brian Paul [Wed, 5 Apr 2017 21:15:27 +0000 (15:15 -0600)]
winsys/svga: use c11 thread types/functions
Gallium no longer has wrappers for mutexes and condition variables.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Thomas Hellstrom [Thu, 15 Sep 2016 11:18:13 +0000 (13:18 +0200)]
winsys/svga: Resolve command submission buffer contention v3
If two contexts wanted to access the same buffer at the same time, it would
end up on two validation lists simultaneously, which might cause a
PIPE_ERROR_RETRY when trying to validate it from one context while the other
context already had it validated but not yet fenced.
In that situation we could spin until the error goes away, or apply various
more or less expensive locking schemes to save cpu.
Here we use a scheme that briefly locks after fencing but avoids locking on
validation in the non-contended case.
v2:
Make sure we broadcast not only on releasing buffers after fencing, but also
after releasing buffers in the pb_validate_validate error path.
v3:
Don't broadcast on PIPE_ERROR_RETRY because that would increase the chance
of starvation.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Brian Paul [Tue, 4 Apr 2017 19:48:46 +0000 (13:48 -0600)]
svga: remove pre-SVGA3D_HWVERSION_WS8_B1 code
3D wasn't officially supported before virtual HW version 8 so we can
remove this old code.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Brian Paul [Tue, 4 Apr 2017 19:21:25 +0000 (13:21 -0600)]
st/wgl: sort strings in stw_extension_string[] array
Trivial.
Charmaine Lee [Mon, 3 Apr 2017 19:57:18 +0000 (12:57 -0700)]
svga: remove redundant surface propagation
Currently, surface propagation for colliding render target resource is
done at framebuffer emit time for vgpu10. This patch
adds the surface propagation for non-vgpu10 path to emit_fb_vgpu9()
and removes the redundant surface copy at set time.
Tested with MTT glretrace, piglit, NobelClinicianViewer, Turbine, Cinebench.
Reviewed-by: Neha Bhende <bhenden@vmware.com>
Charmaine Lee [Tue, 4 Apr 2017 19:14:54 +0000 (13:14 -0600)]
svga: Fix zslice index to svga_texture_copy_handle_resource()
The zslice index to svga_texture_copy_handle_resource() is not adjusted
and should be a signed integer.
This patch fixes piglit tests for non-vgpu10 including
spec@arb_framebuffer_object@fbo-generatemipmap-3d
spec@glsl-1.20@execution@tex-miplevel-selection gl2:texture* 3d
Tested with MTT piglit and glretrace
Brian Paul [Tue, 4 Apr 2017 19:13:33 +0000 (13:13 -0600)]
svga: specify include path for git_sha1.h for out-of-src builds
If we're doing an out-of-src build, we need to specify the #include
path to find git_sha1.h
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Brian Paul [Tue, 4 Apr 2017 19:11:44 +0000 (13:11 -0600)]
st/wgl: pseudo-implementation of WGL_EXT_swap_control
This implementation is based on querying the time just before swap/present
and doing a Sleep() if needed. There is no sync to vblank or actual
coordination with the GPU. This isn't perfect, but basically works.
We've had some request for this functionality, and it sounds like there
are some Windows GL apps that refuse to start if the driver doesn't
advertise this extension.
Note: NVIDIA's Windows OpenGL driver advertises the WGL_EXT_swap_control
string both with wglGetExtensionsStringEXT() and with
glGetString(GL_EXTENSIONS). We're only advertising it with the former at
this time.
Tested with asst. Mesa demos, Google Earth, Lightsmark, etc.
VMware bug 1591534.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
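The timing logic of such a sleep-based approach can be sketched as a pure function. The names, the microsecond units, and the notion of a fixed refresh period are all assumptions made for illustration, not details taken from the st/wgl code.

```c
#include <assert.h>
#include <stdint.h>

/* Microseconds to wait before presenting so that at least `swap_interval`
 * refresh periods have elapsed since the previous present. */
static int64_t swap_wait_usec(int64_t prev_swap_usec, int64_t now_usec,
                              int swap_interval, int64_t refresh_period_usec)
{
   if (swap_interval <= 0)
      return 0;                 /* swap control disabled: never wait */

   int64_t target = prev_swap_usec + (int64_t)swap_interval * refresh_period_usec;
   return target > now_usec ? target - now_usec : 0;
}
```

A caller on Windows would convert the result to milliseconds before passing it to Sleep(), which takes its argument in milliseconds.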
Charmaine Lee [Tue, 4 Apr 2017 19:02:45 +0000 (13:02 -0600)]
svga: Fix out-of-sync backing surface
When a backing surface is reused, it is possible that
the original surface has been changed. So before the backing surface
is bound again, we need to sync up the surface.
This patch creates a new helper function svga_texture_copy_handle_resource()
to sync up the backing surface resource.
This patch, together with the backing surface dirty bit fix, fixes
the rendering corruption in NobelClinicianViewer when rotating the model.
Also tested with MTT glretrace, piglit, Cinebench, Turbine.
Reviewed-by: Brian Paul <brianp@vmware.com>
Charmaine Lee [Wed, 22 Mar 2017 19:45:11 +0000 (12:45 -0700)]
svga: add a reset flag to svga_propagate_surface()
The reset flag specifies if the dirty bit needs to be reset
after the surface is propagated to the texture. This is used
to make sure that the dirty bit is not reset and stay unset
before the surface is unbound.
Reviewed-by: Brian Paul <brianp@vmware.com>
Charmaine Lee [Wed, 22 Mar 2017 17:46:54 +0000 (10:46 -0700)]
svga: add the has_backed_views flag
The new has_backed_views flag specifies if any of the render target
views or depth stencil view is a backing surface view.
The flag is used in svga_propagate_rendertargets() so it can return early
if there is no surface to propagate.
Reviewed-by: Brian Paul <brianp@vmware.com>
Charmaine Lee [Thu, 3 Nov 2016 17:35:55 +0000 (10:35 -0700)]
svga: only destroy render target view from a context that created it
A texture can be destroyed from a different context from which it is
created, but destroying the render target view from a different context
will cause svga device errors. Similar to shader resource view,
this patch skips destroying render target view or depth stencil view
from a non-parent context.
Fixes driver errors running NobelClinician Viewer application.
Tested with NobelClinician Viewer, MTT piglit, glretrace.
Reviewed-by: Brian Paul <brianp@vmware.com>
Charmaine Lee [Tue, 4 Apr 2017 18:47:49 +0000 (12:47 -0600)]
svga: disable rasterization if rasterizer_discard is set or FS undefined
With this patch, rasterization will be disabled if the
rasterizer_discard flag is set or the fragment shader
is undefined due to missing position output from the
vertex/geometry shader.
Tested with piglit test glsl-1.50-geometry-primitive-id-restart.
Also tested with full MTT glretrace and piglit.
v2: As suggested by Roland, to properly disable rasterization, besides
setting FS to NULL, we will also need to disable depth and stencil test.
v3: As suggested by Brian, set SVGA_NEW_DEPTH_STENCIL_ALPHA dirty bit
in svga_bind_rasterizer_state() if the rasterizer_discard flag is
changed.
Reviewed-by: Brian Paul <brianp@vmware.com>
Charmaine Lee [Wed, 15 Mar 2017 22:18:14 +0000 (15:18 -0700)]
svga: do not emulate wide points in GS when doing transform feedback
Emulating wide points in geometry shader when doing transform feedback
is problematic. This patch disables the emulation.
Tested with piglit test ext_transform_feedback-points.
Also tested with MTT glretrace, mesa demos pointblast and spriteblast.
Reviewed-by: Brian Paul <brianp@vmware.com>
Jason Ekstrand [Thu, 6 Apr 2017 20:34:38 +0000 (13:34 -0700)]
anv/query: Use snooping on !LLC platforms
Commit b2c97bc789198427043cd902bc76e194e7e81c7d, which made us start
using a busy-wait for individual query results, also messed up cache
flushing on !LLC platforms. For one thing, I forgot the mfence after
the clflush, so memory access wasn't properly getting fenced. More
importantly, however, we were clflushing the whole query range, then
waiting for individual queries, and then trying to read the results
without clflushing again. Getting the clflushing both correct and
efficient is very subtle and painful. Instead, let's side-step the
problem by just snooping.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Emil Velikov [Thu, 6 Apr 2017 12:01:26 +0000 (13:01 +0100)]
anv: provide anv_gem_busy() stub for the tests
Otherwise linking may fail.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100600
Fixes: f195d40eca4 ("anv/device: Add a helper for querying whether a BO is busy")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
Rob Clark [Tue, 4 Apr 2017 17:01:56 +0000 (13:01 -0400)]
gallium/util: tweak backtrace format with libunwind
To work with addr2line.sh we also need the relative offset within the
DSO. And addr2line.sh gets confused by the leading stackframe number.
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Tue, 4 Apr 2017 13:52:57 +0000 (09:52 -0400)]
gallium/util: cache symbol lookup with libunwind
Signed-off-by: Rob Clark <robdclark@gmail.com>
Rob Clark [Tue, 4 Apr 2017 12:53:57 +0000 (08:53 -0400)]
gallium/util: fix missing limit check in libunwind backtrace
Fixes: 70c272004f ("gallium/util: libunwind support")
Signed-off-by: Rob Clark <robdclark@gmail.com>
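For illustration, a sketch of the kind of limit check the subject line refers to: a loop that fills a fixed-size array from a frame source of unknown depth. The frame_source type and next_frame() are hypothetical stand-ins for an unwinder cursor and its step function; this is not the actual gallium/util code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical frame source standing in for an unwinder cursor. */
struct frame_source {
   const unsigned long *pcs;   /* program counters still to be returned */
   size_t remaining;
};

/* Stores the next pc and returns 1 while frames remain, 0 at the end. */
static int
next_frame(struct frame_source *src, unsigned long *pc)
{
   if (src->remaining == 0)
      return 0;
   *pc = *src->pcs++;
   src->remaining--;
   return 1;
}

/* Capture at most nr_frames entries into out[]; returns the count stored. */
static size_t
capture_backtrace(struct frame_source *src, unsigned long *out,
                  size_t nr_frames)
{
   size_t i = 0;
   unsigned long pc;

   assert(src != NULL && out != NULL);

   /* Test the limit before stepping, so a deep stack cannot write past
    * the end of out[] and no frame is consumed once the array is full. */
   while (i < nr_frames && next_frame(src, &pc))
      out[i++] = pc;

   return i;
}
```

Without the `i < nr_frames` term, a call stack deeper than the array would overrun the caller's buffer.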
Timothy Arceri [Thu, 6 Apr 2017 21:55:17 +0000 (07:55 +1000)]
mesa: fix renderbuffer leak
We don't need to call _mesa_reference_renderbuffer() for the first
assignment as refCount starts at 1. For swrast we work around the
fact we will indirectly call _mesa_reference_renderbuffer() by
resetting refCount to 0.
Fixes: 32141e53d1520 (mesa: tidy up renderbuffer RefCount initialisation)
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
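The leak pattern can be sketched with a toy reference-counted object. All names below (struct rb, rb_create, rb_reference, struct holder, live_objects) are hypothetical and only mimic the general shape of a reference helper; they are not the Mesa implementation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted object, not the real gl_renderbuffer. */
struct rb {
   int ref_count;
};

struct holder {
   struct rb *rb;
};

static int live_objects;   /* allocation counter, for illustration only */

static struct rb *
rb_create(void)
{
   struct rb *rb = calloc(1, sizeof(*rb));
   if (rb == NULL)
      return NULL;
   rb->ref_count = 1;      /* the creator owns the first reference */
   live_objects++;
   return rb;
}

/* Point *ptr at rb (which may be NULL), adjusting reference counts. */
static void
rb_reference(struct rb **ptr, struct rb *rb)
{
   if (*ptr == rb)
      return;
   if (*ptr != NULL) {
      assert((*ptr)->ref_count > 0);
      if (--(*ptr)->ref_count == 0) {
         free(*ptr);
         live_objects--;
      }
   }
   if (rb != NULL)
      rb->ref_count++;
   *ptr = rb;
}

/* Leaky: the count becomes 2, but only the holder's reference is ever
 * dropped, so the object is never freed. */
static void
attach_new_leaky(struct holder *h)
{
   struct rb *rb = rb_create();
   if (rb != NULL)
      rb_reference(&h->rb, rb);
}

/* Fixed: hand the initial reference straight to the holder.
 * Assumes h->rb is NULL on entry. */
static void
attach_new(struct holder *h)
{
   assert(h->rb == NULL);
   h->rb = rb_create();
}
```

In the leaky variant the creator's initial reference has no owner left to release it; the fixed variant transfers that reference instead of taking a second one.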
Samuel Iglesias Gonsálvez [Tue, 21 Mar 2017 06:16:27 +0000 (07:16 +0100)]
anv/blorp: sample input attachments with resolves on BDW
On Broadwell we still need to do a resolve between the subpass
that writes and the subpass that reads when there is a
self-dependency, because the HW cannot see fast-clears and works
on the render cache as if it were a regular non-fast-clear surface.
Fixes 16 tests on BDW:
dEQP-VK.renderpass.formats.*.input.clear.store.self_dep*
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Fredrik Höglund [Sat, 1 Apr 2017 13:03:09 +0000 (15:03 +0200)]
radv: don't call radeon_check_space in radv_BindDescriptorSets
This appears to be a leftover from an earlier version of this function.
Nothing is emitted into the CS.
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fredrik Höglund [Wed, 29 Mar 2017 17:19:47 +0000 (19:19 +0200)]
radv: implement VK_KHR_descriptor_update_template
All offsets and strides are precomputed by
radv_CreateDescriptorUpdateTemplateKHR and stored in the template.
v2: Move the new struct declarations from radv_descriptor_set.h
to radv_private.h (Bas)
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
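As a rough sketch of what precomputing offsets and strides buys: once each template entry records where its data lives in the application's buffer and where it lands in the set, applying the template reduces to a strided copy. The struct template_entry layout and apply_template() below are assumptions for illustration, not the radv data structures, and the flat byte array stands in for descriptor set memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical template entry; every field is computed once, when the
 * template is created, rather than on each update. */
struct template_entry {
   size_t src_offset;   /* byte offset of the first element in user data */
   size_t src_stride;   /* byte distance between user data elements */
   size_t dst_offset;   /* byte offset of the first descriptor in the set */
   size_t dst_stride;   /* byte distance between descriptors in the set */
   size_t elem_size;    /* bytes copied per descriptor */
   uint32_t count;      /* number of descriptors covered by this entry */
};

/* Apply all entries: no layout lookups, only strided copies. */
static void
apply_template(uint8_t *set, const struct template_entry *entries,
               size_t entry_count, const void *data)
{
   const uint8_t *src_base = data;

   assert(set != NULL && data != NULL);

   for (size_t i = 0; i < entry_count; i++) {
      const struct template_entry *e = &entries[i];
      for (uint32_t j = 0; j < e->count; j++) {
         memcpy(set + e->dst_offset + j * e->dst_stride,
                src_base + e->src_offset + j * e->src_stride,
                e->elem_size);
      }
   }
}
```

A real driver would translate each descriptor into its hardware format instead of copying raw bytes, but the indexing arithmetic would follow the same pattern.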
Fredrik Höglund [Wed, 29 Mar 2017 16:12:44 +0000 (18:12 +0200)]
radv: implement VK_KHR_push_descriptor
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fredrik Höglund [Wed, 29 Mar 2017 16:11:56 +0000 (18:11 +0200)]
radv: replace an assertion with a conditional
Replace the !binding_layout->immutable_samplers assertion in
radv_update_descriptor_sets with a conditional.
The Vulkan specification does not say that it is illegal to update
a sampler descriptor when it is immutable; only that pImageInfo is
ignored.
This change is also needed for push descriptors, because valid
descriptors must be pushed for all bindings accessed by shaders,
including immutable sampler descriptors.
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>