git.libre-soc.org Git - mesa.git/log

projects / mesa.git / log

summary | shortlog | log | commit | commitdiff | tree
first ⋅ prev ⋅ next

commit | commitdiff | tree

Emil Velikov [Wed, 27 Sep 2017 16:36:29 +0000 (17:36 +0100)]

egl/dri: link directly to libglapi.so

Shared glapi (libglapi.so) has been a requirement for years, in order
to build EGL.

Remove the no longer necessary dlopen/dlsym dance and link to the
library directly.

This allows us to remove a handful of platform specific workarounds, due
to the different name of the library.

v2:
- Android: export the include dir (RobH)
- Drop unused local variable (Eric)

Cc: Jonathan Gray <jsg@jsg.id.au>
Cc: Jon Turney <jon.turney@dronecode.org.uk>
Cc: Julien Isorce <julien.isorce@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Tested-by: Tomasz Figa <tfiga@chromium.org> (v1)
Tested-by: Rob Herring <robh@kernel.org>

commit | commitdiff | tree

Emil Velikov [Mon, 18 Sep 2017 10:29:21 +0000 (11:29 +0100)]

swr/rast: do not crash on NULL strings returned by getenv

The current convenience function GetEnv feeds the results of getenv
directly into std::string(). That is a bad idea, since the variable
may be unset, thus we feed NULL into the C++ construct.

The latter of which is not allowed and leads to a crash.

v2: Better variable name, implicit char* -> std::string conversion (Eric)

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101832
Fixes: a25093de718 ("swr/rast: Implement JIT shader caching to disk")
Cc: Tim Rowley <timothy.o.rowley@intel.com>
Cc: Laurent Carlier <lordheavym@gmail.com>
Cc: Bernhard Rosenkraenzer <bero@lindev.ch>
[Emil Velikov: make an actual commit from the misc diff]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com> (v1)
Reviewed-by: Laurent Carlier <lordheavym@gmail.com> (v1)

commit | commitdiff | tree

Rob Clark [Thu, 14 Sep 2017 19:03:11 +0000 (15:03 -0400)]

freedreno/a5xx: align height to GMEM

Similar to the way width/pitch alignment works, it seems like we need to
do similar for height.  Otherwise the BLIT from system memory to GMEM
can over-fetch beyond the end of the buffer, triggering a fault.

I'm not sure if there is a better solution yet.  Possibly we could fall
back to pre-a5xx style DRAW packets for cases where BLIT might over-
fetch.  (We in theory have that problem already with rendering to higher
mipmap levels, although fortunately those tend to use GMEM bypass.)

This fixes issues reported with glamor.

Reported-by: don.harbin@linaro.org
Cc: 17.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Rob Clark <robdclark@gmail.com>

commit | commitdiff | tree

Nicolai Hähnle [Tue, 26 Sep 2017 18:36:10 +0000 (20:36 +0200)]

radeonsi: adjust clip discard based on line width / point size

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Tue, 26 Sep 2017 16:10:58 +0000 (18:10 +0200)]

radeonsi: remove si_context::{scissor_enabled,clip_halfz}

They are just copies of the rasterizer state.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Tue, 26 Sep 2017 16:00:21 +0000 (18:00 +0200)]

radeonsi: simplify the signature of si_update_vs_writes_viewport_index

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Tue, 26 Sep 2017 15:57:59 +0000 (17:57 +0200)]

radeonsi: move current_rast_prim into si_context

v2: rebase fixes

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Tue, 26 Sep 2017 15:56:15 +0000 (17:56 +0200)]

radeonsi: move and rename scissor and viewport state and functions

v2: change GET_MAX_SCISSOR to SI_MAX_SCISSOR

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Tue, 26 Sep 2017 15:24:19 +0000 (17:24 +0200)]

radeonsi: remove si_apply_scissor_bug_workaround

It only affects pre-SI chips.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Tue, 26 Sep 2017 15:17:55 +0000 (17:17 +0200)]

radeonsi: move r600_viewport.c to si_viewport.c

This is purely a file-move + #include fixup + build system changes.
Other cleanups will follow in subsequent commits.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Sun, 17 Sep 2017 09:59:37 +0000 (11:59 +0200)]

radeonsi: fix maximum advertised point size / line width

The hardware registers store the half-size/width in 12.4 fixed point
format, so 8192 is the maximum.

Fixes dEQP-GLES3.functional.rasterization.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Sun, 17 Sep 2017 09:28:21 +0000 (11:28 +0200)]

radeonsi: deduce rast_prim correctly for tessellation point mode

Together with the previous patches, this fixes
dEQP-GLES31.functional.primitive_bounding_box.wide_points.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Sun, 17 Sep 2017 09:26:53 +0000 (11:26 +0200)]

radeonsi: don't discard points and lines

This is a bit conservative, but a more precise solution requires access
to the rasterizer state. This is something to tackle after the fork between
r600 and radeonsi.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Sun, 17 Sep 2017 09:10:04 +0000 (11:10 +0200)]

radeonsi: move current_rast_prim to r600_common_context

We'll use it in the scissors / clip / guardband state.

v2: avoid a performance regression on r600 when applied to
(pre-fork) stable branches

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Wed, 27 Sep 2017 15:05:07 +0000 (17:05 +0200)]

st/mesa: use R10G10B10X2 format where applicable

This is the last step of fixing
dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgb_unsigned_int_2_10_10_10_rev
for radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Wed, 27 Sep 2017 13:25:35 +0000 (15:25 +0200)]

gallium: add PIPE_FORMAT_R10G10B10X2_UNORM

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Wed, 27 Sep 2017 13:25:10 +0000 (15:25 +0200)]

mesa/main: R10G10B10_(A2) formats are not color renderable in ES

The EXT_texture_type_2_10_10_10_REV (ES only) states the following issue:

"1. Should textures specified with this type be renderable?

UNRESOLVED: No. A separate extension could provide this functionality."

This partially fixes
dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.{rgb,rgba}_unsigned_int_2_10_10_10_rev

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Wed, 27 Sep 2017 13:24:31 +0000 (15:24 +0200)]

mesa/main: select the R10G10B10X2_UNORM internal format based on data type

ES requires it. This is a partial fix for
dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.rgb_unsigned_int_2_10_10_10_rev

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Mon, 11 Sep 2017 14:40:37 +0000 (16:40 +0200)]

glsl: do not set the 'smooth' qualifier by default on ES shaders

It leads to surprising states with integer inputs and outputs on
vertex processing stages (e.g. geometry stages). Instead, rely on the
driver to choose smooth interpolation by default.

We still allow varyings to match when one stage declares it as smooth
and the other declares it without interpolation qualifiers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

commit | commitdiff | tree

Rob Clark [Sun, 1 Oct 2017 19:29:24 +0000 (15:29 -0400)]

freedreno: fix PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE

Fixes an assert in fd_acc_query_register_provider() about query provider
not already registered.

Fixes: 3f6b3d9d ("gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE")
Signed-off-by: Rob Clark <robdclark@gmail.com>

commit | commitdiff | tree

Eric Engestrom [Mon, 2 Oct 2017 10:34:52 +0000 (11:34 +0100)]

egl/wayland: simplify LIBGL_ALWAYS_SOFTWARE logic

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>

commit | commitdiff | tree

Nicolai Hähnle [Fri, 29 Sep 2017 09:21:01 +0000 (11:21 +0200)]

radeonsi: fix a regression in integer cube map handling

A recent commit fixed the case of 8888 integer cube maps, which need the
workaround of replacing the data format with USCALED/SSCALED. However,
this broke the case of non-8888 integer cube maps; those still need the
fix of shifting the texture coordinates.

Fixes KHR-GL45.texture_gather.plain-gather-int-cube-array and similar.

Fixes: 6fb0c1013b35 ("radeonsi: workaround for gather4 on integer cube maps")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Fri, 29 Sep 2017 09:17:03 +0000 (11:17 +0200)]

amd/common: move ac_build_phi from radeonsi

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Samuel Pitoiset [Thu, 28 Sep 2017 11:55:31 +0000 (13:55 +0200)]

radv: remove unused radv_meta_state::btoi::render_pass handle

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

commit | commitdiff | tree

Samuel Pitoiset [Fri, 29 Sep 2017 15:11:12 +0000 (17:11 +0200)]

radv: do not check the number of levels when doing fast htile

We shouldn't reach this point because HTILE is only enabled
when the number of levels is 1.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

commit | commitdiff | tree

Samuel Pitoiset [Thu, 28 Sep 2017 11:50:56 +0000 (13:50 +0200)]

radv: cleanup radv_device_finish_meta_XXX() helpers

Unnecessary to double check that handles are not NULL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

commit | commitdiff | tree

Samuel Pitoiset [Fri, 29 Sep 2017 12:13:48 +0000 (14:13 +0200)]

radv: select the pipeline outside of emit_fast_clear_flush()

It can't change during the decompression pass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

commit | commitdiff | tree

Samuel Pitoiset [Fri, 29 Sep 2017 09:07:22 +0000 (11:07 +0200)]

radv: drop useless param in emit_depth_decomp()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

commit | commitdiff | tree

Samuel Pitoiset [Thu, 28 Sep 2017 09:18:01 +0000 (11:18 +0200)]

radv: drop useless check in depth_view_can_fast_clear()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

commit | commitdiff | tree

Samuel Pitoiset [Fri, 29 Sep 2017 07:43:29 +0000 (09:43 +0200)]

radv: add radv_subpass_clear_attachment() helper

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

commit | commitdiff | tree

Samuel Pitoiset [Thu, 28 Sep 2017 19:47:14 +0000 (21:47 +0200)]

radv: add radv_attachment_needs_clear() helper

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

commit | commitdiff | tree

Samuel Pitoiset [Thu, 28 Sep 2017 19:28:18 +0000 (21:28 +0200)]

radv: remove unused param in radv_handle_{cmask,dcc}_image_transition()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

commit | commitdiff | tree

Samuel Pitoiset [Thu, 28 Sep 2017 08:24:09 +0000 (10:24 +0200)]

radv: add radv_vi_dcc_enabled() helper

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

commit | commitdiff | tree

Samuel Pitoiset [Wed, 27 Sep 2017 19:56:20 +0000 (21:56 +0200)]

radv: do not need to double zero-init the meta state structures

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

commit | commitdiff | tree

Samuel Pitoiset [Wed, 27 Sep 2017 19:49:53 +0000 (21:49 +0200)]

radv: inline destroy_render_pass()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

commit | commitdiff | tree

Samuel Pitoiset [Wed, 27 Sep 2017 19:47:55 +0000 (21:47 +0200)]

radv: use pipeline handles instead of objects for meta clear operations

To be consistent with other meta operations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

commit | commitdiff | tree

Samuel Pitoiset [Wed, 27 Sep 2017 19:21:57 +0000 (21:21 +0200)]

radv: inline blit2d_unbind_dst()

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

commit | commitdiff | tree

Samuel Pitoiset [Fri, 29 Sep 2017 14:48:07 +0000 (16:48 +0200)]

radv: rework DCC/CMASK/FMASK/HTILE allocations

Add helpers and some comments to make the thing more readable.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>

commit | commitdiff | tree

Eric Engestrom [Fri, 29 Sep 2017 14:25:18 +0000 (15:25 +0100)]

meson: fix version typo + grammar

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>

commit | commitdiff | tree

Iago Toral Quiroga [Wed, 20 Sep 2017 07:22:51 +0000 (09:22 +0200)]

i965: skip reading unused slots at the begining of the URB for the FS

We can start reading the URB at the first offset that contains varyings
that are actually read in the URB. We still need to make sure that we
read at least one varying to honor hardware requirements.

This helps alleviate a problem introduced with 99df02ca26f61 for
separate shader objects: without separate shader objects we assign
locations sequentially, however, since that commit we have changed the
method for SSO so that the VUE slot assigned depends on the number of
builtin slots plus the location assigned to the varying. This fixed
layout is intended to help SSO programs by avoiding on-the-fly recompiles
when swapping out shaders, however, it also means that if a varying uses
a large location number close to the maximum allowed by the SF/FS units
(31), then the offset introduced by the number of builtin slots can push
the location outside the range and trigger an assertion.

This problem is affecting at least the following CTS tests for
enhanced layouts:

KHR-GL45.enhanced_layouts.varying_array_components
KHR-GL45.enhanced_layouts.varying_array_locations
KHR-GL45.enhanced_layouts.varying_components
KHR-GL45.enhanced_layouts.varying_locations

which use SSO and the the location layout qualifier to select such
location numbers explicitly.

This change helps these tests because for SSO we always have to include
things such as VARYING_SLOT_CLIP_DIST{0,1} even if the fragment shader is
very unlikely to read them, so by doing this we free builtin slots from
the fixed VUE layout and we avoid the tests to crash in this scenario.

Of course, this is not a proper fix, we'd still run into problems if someone
tries to use an explicit max location and read gl_ViewportIndex, gl_LayerID or
gl_CullDistancein in the FS, but that would be a much less common bug and we
can probably wait to see if anyone actually runs into that situation in a real
world scenario before making the decision that more aggresive changes are
required to support this without reverting 99df02ca26f61.

v2:
- Add a debug message when we skip clip distances (Ilia)
- we also need to account for this when we compute the urb setup
  for the fragment shader stage, so add a compiler util to compute
  the first slot that we need to read from the URB instead of
  replicating the logic in both places.

v3:
- Make the util more generic so it can account for all unused slots
  at the beginning of the URB, that will make it more useful (Ken).
- Drop the debug message, it was not what Ilia was asking for.

Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

commit | commitdiff | tree

Matt Turner [Fri, 30 Jun 2017 22:10:17 +0000 (15:10 -0700)]

i965: Normalize types for FBL, FBH, etc

Allows the instructions to be compacted. The documentation claims that
some of these only accept UD types, even though the type doesn't change
the operation performed. Just normalize the types to ensure we get
instruction compaction.

The only functional changes are for FBL and CBIT (always use UD types)
and FBH (always use the same types).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

commit | commitdiff | tree

Marek Olšák [Fri, 29 Sep 2017 20:31:21 +0000 (22:31 +0200)]

radeonsi: don't use the template keyword

for C++ editors

Reviewed-by: Brian Paul <brianp@vmware.com>

commit | commitdiff | tree

Marek Olšák [Fri, 29 Sep 2017 20:31:21 +0000 (22:31 +0200)]

glx: don't use the template keyword

for C++ editors

Reviewed-by: Brian Paul <brianp@vmware.com>

commit | commitdiff | tree

Marek Olšák [Fri, 29 Sep 2017 20:31:21 +0000 (22:31 +0200)]

gallium/vl: don't use the template keyword

for C++ editors

Reviewed-by: Brian Paul <brianp@vmware.com>

commit | commitdiff | tree

Marek Olšák [Fri, 29 Sep 2017 20:31:21 +0000 (22:31 +0200)]

egl/dri2: don't use the template keyword

for C++ editors

Reviewed-by: Brian Paul <brianp@vmware.com>

commit | commitdiff | tree

Benedikt Schemmer [Fri, 29 Sep 2017 19:02:13 +0000 (21:02 +0200)]

radeonsi/uvd: clean up si_video_buffer_create

V2: remove code duplication and one unnessecary variable, minor whitespace fix

Signed-off-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Marek Olšák [Fri, 29 Sep 2017 14:05:49 +0000 (16:05 +0200)]

radeonsi/uvd: fix planar formats broken since f70f6baaa3bb0f8b280ac2eaea69bb

Tested-by: Benedikt Schemmer <ben@besd.de>
Reviewed-by: Christian König <christian.koenig@amd.com>

commit | commitdiff | tree

Roland Scheidegger [Thu, 28 Sep 2017 01:45:04 +0000 (03:45 +0200)]

gallium: add new LOD opcode

The operation performed is all the same as LODQ, but with the usual
differences between dx10 and GL texture opcodes, that is separate resource
and sampler indices (plus result swizzling, and setting z/w channels
to zero).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

commit | commitdiff | tree

Kamil Páral [Fri, 22 Sep 2017 21:31:08 +0000 (23:31 +0200)]

drirc: whitelist glthread for Outlast

FPS increase 10-20% in starting locations on Core i5-4570 +
Radeon R9 270.

commit | commitdiff | tree

Jan Vesely [Sat, 16 Sep 2017 22:55:48 +0000 (18:55 -0400)]

travis: Add clover build using llvm-5.0

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>

commit | commitdiff | tree

Jan Vesely [Sat, 16 Sep 2017 22:54:24 +0000 (18:54 -0400)]

travis: Add clover build using llvm-4.0

llvm-4 needs gcc 4.8:
http://releases.llvm.org/4.0.1/docs/ReleaseNotes.html#non-comprehensive-list-of-changes-in-this-release

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>

commit | commitdiff | tree

Jan Vesely [Sat, 16 Sep 2017 22:52:52 +0000 (18:52 -0400)]

travis: Add clover build using llvm-3.9

Use r600,radeonsi instead of i915
Update binutils, new linker is required for llvm-3.9:
https://www.ubuntuupdates.org/package/core/trusty/universe/updates/binutils-2.26

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>

commit | commitdiff | tree

Leo Liu [Fri, 29 Sep 2017 01:41:29 +0000 (21:41 -0400)]

st/va: add dst rect to avoid scale on deint

For 1080p video transcode, the height will be scaled to 1088 when deint
to progressive buffer. Set dst rect to make sure no scale.

Fixes: 3ad8687 "st/va: use new vl_compositor_yuv_deint_full() to deint"
Signed-off-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Acked-by: Andy Furniss <adf.lists@gmail.com>

commit | commitdiff | tree

Nicolai Hähnle [Sat, 16 Sep 2017 10:52:21 +0000 (12:52 +0200)]

radeonsi: emit DLDEXP and DFRACEXP TGSI opcodes

Note: this causes spurious regressions in some current piglit tests,
because the tests incorrectly assume that there is no denorm support for
doubles. I'm going to send out a fix for those tests as well.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

commit | commitdiff | tree

Nicolai Hähnle [Fri, 15 Sep 2017 14:59:09 +0000 (16:59 +0200)]

radeonsi: emit LDEXP opcode

The LLVM intrinsic has existed for a long time. The current name was
established in LLVM 3.9.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

commit | commitdiff | tree

Nicolai Hähnle [Fri, 15 Sep 2017 14:52:23 +0000 (16:52 +0200)]

st/glsl_to_tgsi: use LDEXP when available

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

commit | commitdiff | tree

Nicolai Hähnle [Fri, 15 Sep 2017 14:51:14 +0000 (16:51 +0200)]

gallium: add LDEXP TGSI instruction and corresponding cap

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

commit | commitdiff | tree

Nicolai Hähnle [Fri, 15 Sep 2017 16:47:52 +0000 (18:47 +0200)]

tgsi: infer that dst[1] of DFRACEXP is an integer

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

commit | commitdiff | tree

Nicolai Hähnle [Sat, 16 Sep 2017 10:50:42 +0000 (12:50 +0200)]

gallivm: add support for TGSI instructions with two outputs

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

commit | commitdiff | tree

Nicolai Hähnle [Fri, 15 Sep 2017 16:45:32 +0000 (18:45 +0200)]

gallivm: add dst register index to lp_build_tgsi_context::emit_store

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

commit | commitdiff | tree

Nicolai Hähnle [Fri, 15 Sep 2017 16:34:48 +0000 (18:34 +0200)]

tgsi: clarify the semantics of DFRACEXP

The status quo is quite the mess:

1. tgsi_exec will do a per-channel computation, and store the dst[0]
   result (significand) correctly for each channel. The dst[1] result
   (exponent) will be written to the first bit set in the writemask.
   So per-component calculation only works partially.

2. r600 will only do a single computation. It will replicate the
   exponent but not the significand.

3. The docs pretend that there's per-component calculation, but even
   get dst[0] and dst[1] confused.

4. Luckily, st_glsl_to_tgsi only ever emits single-component instructions,
   and kind-of assumes that everything is replicated, generating this for
   the dvec4 case:

     DFRACEXP TEMP[0].xy, TEMP[1].x, CONST[0][0].xyxy
     DFRACEXP TEMP[0].zw, TEMP[1].y, CONST[0][0].zwzw
     DFRACEXP TEMP[2].xy, TEMP[1].z, CONST[0][1].xyxy
     DFRACEXP TEMP[2].zw, TEMP[1].w, CONST[0][1].zwzw

Settle on the simplest behavior, which is single-component calculation
with replication, document it, and adjust tgsi_exec and r600.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

commit | commitdiff | tree

Nicolai Hähnle [Fri, 15 Sep 2017 15:47:27 +0000 (17:47 +0200)]

tgsi: fix the documentation of DLDEXP

Sourcing the exponent for the zw destination pair from Z is consistent
with both tgsi_exec and gallivm. In practice, st_glsl_to_tgsi always
generates per-channel instructions anyway.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

commit | commitdiff | tree

Nicolai Hähnle [Fri, 15 Sep 2017 15:40:05 +0000 (17:40 +0200)]

tgsi: infer that DLDEXP's second source has an integer type

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

commit | commitdiff | tree

Nicolai Hähnle [Fri, 15 Sep 2017 14:39:31 +0000 (16:39 +0200)]

glsl/lower_instruction: handle denorms and overflow in ldexp correctly

GLSL ES requires both, and while GLSL explicitly doesn't require correct
overflow handling, it does appear to require handling input inf/denorms
correctly.

Fixes dEQP-GLES31.functional.shaders.builtin_functions.precision.ldexp.*

Cc: mesa-stable@lists.freedesktop.org
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

commit | commitdiff | tree

Nicolai Hähnle [Thu, 28 Sep 2017 15:52:42 +0000 (17:52 +0200)]

util/queue: fix a race condition in the fence code

A tempting alternative fix would be adding a lock/unlock pair in
util_queue_fence_is_signalled. However, that wouldn't actually
improve anything in the semantics of util_queue_fence_is_signalled,
while making that test much more heavy-weight. So this lock/unlock
pair in util_queue_fence_destroy for "flushing out" other threads
that may still be in util_queue_fence_signal looks like the better
fix.

v2: rephrase the comment

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Gustaw Smolarczyk <wielkiegie@gmail.com>

commit | commitdiff | tree

Nicolai Hähnle [Thu, 28 Sep 2017 19:46:30 +0000 (21:46 +0200)]

r600: cleanup set_occlusion_query_state

This fixes a warning caused by the fork (note the change in the function
signature):

../../../../../mesa-src/src/gallium/drivers/r600/r600_state_common.c: In function ‘r600_init_common_state_functions’:
../../../../../mesa-src/src/gallium/drivers/r600/r600_state_common.c:2974:36: warning: assignment from incompatible pointer type [-Wincompatible-pointer-types]
rctx->b.set_occlusion_query_state = r600_set_occlusion_query_state;

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Thu, 28 Sep 2017 19:44:35 +0000 (21:44 +0200)]

r300: add missing case PIPE_SHADER_CAP_INT64_ATOMICS

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Sat, 23 Sep 2017 12:19:59 +0000 (14:19 +0200)]

radeonsi: fix border color translation for integer textures

This fixes the extremely unlikely case that an application uses
0x80000000 or 0x3f800000 as border color for an integer texture and
helps in the also, but perhaps slightly less, unlikely case that 1 is
used as a border color.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

commit | commitdiff | tree

Nicolai Hähnle [Sat, 23 Sep 2017 08:29:51 +0000 (10:29 +0200)]

radeonsi: clamp border colors for upgraded depth textures

The hardware does this automatically for unorm formats, but we need to
do it manually for unorm depth formats that have been upgraded to
Z32_FLOAT.

Fixes dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_unorm_depth
and others.

Fixes: d4d9ec55c589 ("radeonsi: implement TC-compatible HTILE")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

commit | commitdiff | tree

Nicolai Hähnle [Sat, 23 Sep 2017 11:20:25 +0000 (13:20 +0200)]

radeonsi: clamp depth comparison value only for fixed point formats

The hardware usually does this automatically. However, we upgrade
depth to Z32_FLOAT to enable TC-compatible HTILE, which means the
hardware no longer clamps the comparison value for us.

The only way to tell in the shader whether a clamp is required
seems to be to communicate an additional bit in the descriptor
table. While VI has some unused bits in the resource descriptor,
those bits have unfortunately all been used in gfx9. So we use
an unused bit in the sampler state instead.

Fixes dEQP-GLES3.functional.texture.shadow.2d.linear.equal_depth_component32f
and many other tests in dEQP-GLES3.functional.texture.shadow.*

Fixes: d4d9ec55c589 ("radeonsi: implement TC-compatible HTILE")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

commit | commitdiff | tree

Nicolai Hähnle [Thu, 21 Sep 2017 14:50:08 +0000 (16:50 +0200)]

radeonsi/gfx9: fix geometry shaders without output vertices

Not that those are super common or useful, but hey! Fun corner cases
of the API...

Fixes dEQP-GLES31.functional.geometry_shading.emit.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

commit | commitdiff | tree

Nicolai Hähnle [Fri, 22 Sep 2017 17:14:16 +0000 (19:14 +0200)]

amd/common: save an instruction in the build_cube_select sequence

Avoid a v_cndmask: the absolute value is free due to input modifiers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

commit | commitdiff | tree

Nicolai Hähnle [Fri, 22 Sep 2017 17:05:52 +0000 (19:05 +0200)]

amd/common: fix build_cube_select

Fix the custom cube coord selection sequence to be identical to
the hardware v_cubesc/tc and OpenGL spec. Affects texture sampling
with user-provided derivatives.

Fixes dEQP-GLES3.functional.shaders.texture_functions.texturegrad.*

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

commit | commitdiff | tree

Nicolai Hähnle [Fri, 22 Sep 2017 14:59:08 +0000 (16:59 +0200)]

st/glsl_to_tgsi: fix conditional assignments to packed shader outputs

Overriding the default (no-op) swizzle is clearly counter-productive,
since the whole point is putting the destination register as one of
the source operands so that it remains unmodified when the assignment
condition is false.

Fragment depth and stencil outputs are a special case due to how their
source swizzles are manipulated in translate_src when compiling to
TGSI.

Fixes dEQP-GLES2.functional.shaders.conditionals.if.*_vertex
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

commit | commitdiff | tree

Nicolai Hähnle [Thu, 21 Sep 2017 14:55:35 +0000 (16:55 +0200)]

st/glsl_to_tgsi: fix a use-after-free in merge_two_dsts

Found by address sanitizer.

The loop here tries to be safe, but in doing so, it ends up doing
exactly the wrong thing: the safe foreach is for when the loop
variable (inst) could be deleted and nothing else. However, this
particular can delete inst's successor, but not inst itself.

Fixes: 8c6a0ebaad72 ("st/mesa: add st fp64 support (v7.1)")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>

commit | commitdiff | tree

Nicolai Hähnle [Sat, 23 Sep 2017 20:34:10 +0000 (22:34 +0200)]

radeonsi: move descriptor logs to after corresponding draw/compute packet

It has to happen after descriptor uploads since otherwise we'll print out
the wrong GPU list / incorrectly claim descriptor corruption.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Mon, 18 Sep 2017 13:44:50 +0000 (15:44 +0200)]

amd/common: remove ac_shader_abi::chip_class

Redundant with the recently added ac_llvm_context::chip_class.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Nicolai Hähnle [Thu, 14 Sep 2017 14:17:31 +0000 (16:17 +0200)]

gallium/radeon: fix a comment

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Iago Toral Quiroga [Wed, 27 Sep 2017 09:36:31 +0000 (11:36 +0200)]

i965/fs: force pull model for 64-bit GS inputs

Triggering the push model when 64-bit inputs are involved is not easy due to
the constrains on the maximum number of registers that we allow for this mode,
however, for GS with 'points' primitive type and just a couple of double
varyings we can trigger this and it just doesn't work because the
implementation is not 64-bit aware at all. For now, let's make sure that we
don't attempt this model whith 64-bit inputs and we always fall back to pull
model for them.

Also, don't enable the VUE handles in the thread payload on the fly when we
find an input for which we need the pull model, this is not safe: if we need
to resort to the pull model we need to account for that when we setup the
thread payload so we compute the first non-payload register properly. If we
didn't do that correctly and we enable it on-the-fly here then we will end up
VUE handles on the first non-payload register which will probably lead to
GPU hangs. Instead, always enable the VUE handles for the pull model so we
can safely use them when needed. The GS is going to resort to pull model
almost in every situation anyway, so this shouldn't make a significant
difference and it makes things easier and safer.

v2: Always enable the VUE handles for pull model, this is easier and safer
and the GS is going to fallback to pull model almost always anyway (Ken)

v3: Only clamp the URB read length if we are over the maximum reserved for
push inputs as we were doing in the original code (Ken).

v4: No need to clamp the urb read length if invocations > 1

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

commit | commitdiff | tree

Jason Ekstrand [Thu, 28 Sep 2017 16:58:59 +0000 (09:58 -0700)]

i965/link: Use prog->nir instead of creating a temporary

This way, when NIR_PASS_V makes a clone of the shader (for testing
nir_clone), the new and lowered version gets re-assigned to prog->nir.

[jordan.l.justen@intel.com: Tested NIR_TEST_CLONE=1 with valgrind]
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

commit | commitdiff | tree

Jason Ekstrand [Thu, 28 Sep 2017 16:58:38 +0000 (09:58 -0700)]

i965/link: Make more use of NIR_PASS

[jordan.l.justen@intel.com: Tested NIR_TEST_CLONE=1 with valgrind]
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

commit | commitdiff | tree

Jason Ekstrand [Thu, 28 Sep 2017 16:55:15 +0000 (09:55 -0700)]

i965/link: Make better use of temporary variables

The way NIR_PASS works (and, by extension, nir_optimize) is that they
may clone the shader and throw the old one away. (We use this for
testing nir_clone.) It's better if we just make a temporary variable,
use it for everything, and re-assign to the gl_program at the end.

[jordan.l.justen@intel.com: Tested NIR_TEST_CLONE=1 with valgrind]
Tested-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>

commit | commitdiff | tree

Thomas Helland [Wed, 27 Sep 2017 19:24:06 +0000 (21:24 +0200)]

util: fix in-class initialization of static member

Fix a compile error with G++ 4.4

string_buffer_test.cpp:43: error: ISO C++ forbids initialization of
member ‘str1’
string_buffer_test.cpp:43: error: making ‘str1’ static
string_buffer_test.cpp:43: error: invalid in-class initialization of
static data member of non-integral type ‘const char*’

Tested-by: Vinson Lee <vlee at freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103002

commit | commitdiff | tree

Eric Engestrom [Thu, 28 Sep 2017 17:08:59 +0000 (18:08 +0100)]

REVIEWERS: add myself as a Meson reviewer

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>

commit | commitdiff | tree

Eric Engestrom [Thu, 28 Sep 2017 12:36:09 +0000 (13:36 +0100)]

REVIEWERS: add Meson

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>

commit | commitdiff | tree

Dylan Baker [Wed, 27 Sep 2017 23:24:38 +0000 (16:24 -0700)]

meson: remove duplicate libisl dependency in anv

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>

commit | commitdiff | tree

Brian Paul [Tue, 26 Sep 2017 15:57:02 +0000 (09:57 -0600)]

svga: add missing PIPE_SHADER_CAP_INT64_ATOMICS switch cases

Silences a compiler warning.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>

commit | commitdiff | tree

Brian Paul [Tue, 26 Sep 2017 15:58:55 +0000 (09:58 -0600)]

svga: trivial whitespace clean-ups in svga_screen.c

commit | commitdiff | tree

Brian Paul [Fri, 8 Sep 2017 22:42:12 +0000 (16:42 -0600)]

gallium/util: use new util_vasprintf() function

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

commit | commitdiff | tree

Brian Paul [Fri, 8 Sep 2017 22:41:16 +0000 (16:41 -0600)]

util: add util_vasprintf() for Windows (v2)

We don't have vasprintf() on Windows so we need to implement it ourselves.

v2: compute actual length of output string, per Nicolai Hähnle.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>

commit | commitdiff | tree

Brian Paul [Fri, 8 Sep 2017 22:40:52 +0000 (16:40 -0600)]

st/mesa: don't call close() on Windows

Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Neha Bhende [Fri, 14 Jul 2017 17:59:46 +0000 (10:59 -0700)]

svga: start advertising PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION

Since our driver support arb_provoking_vertex, we can start
advertising PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION
Fixes ./clipflat & ./arb-provoking-vertex-render piglit tests

Tested piglit, glretrace on Hw 11 and Hw 13

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>

commit | commitdiff | tree

Marek Olšák [Wed, 27 Sep 2017 11:17:21 +0000 (13:17 +0200)]

mesa: fix texture updates for ATI_fragment_shader

Cc: 17.1 17.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>

commit | commitdiff | tree

Lucas Stach [Wed, 6 Sep 2017 12:30:40 +0000 (14:30 +0200)]

etnaviv: optimize RS transfers

Currently we are blitting the whole resource when the RS is used to
de-/tile a resource. This can be very inefficient for large resources
where the transfer is only changing a small part of the resource
(happens a lot with glTexSubImage2D).

Optimize this by only blitting the tile aligned subregion of the
resource, which the transfer is going to change.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>

commit | commitdiff | tree

Lucas Stach [Wed, 6 Sep 2017 12:28:21 +0000 (14:28 +0200)]

etnaviv: add resource subregion copy

This is useful if we only need to copy part of a larger resource, mostly
when using the RS engine to de-/tile on pipe transfers.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>

commit | commitdiff | tree

Lucas Stach [Wed, 6 Sep 2017 12:24:30 +0000 (14:24 +0200)]

etnaviv: support tile aligned RS blits

The RS can blit abitrary tile aligned subregions of a resource by
adjusting the buffer offset.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-By: Wladimir J. van der Laan <laanwj@gmail.com>

commit | commitdiff | tree

Leo Liu [Tue, 26 Sep 2017 13:11:52 +0000 (09:11 -0400)]

st/va: use pipe transfer_map to map upload buffer

The function pipe_buffer_map() is only for linear pipe buffer,
with height as 0, and it's not for any 2D textures.

Signed-off-by: Leo Liu <leo.liu@amd.com>
Cc: mesa-stable@lists.freedesktop.org
Cc: Mark Thompson <sw@jkqxz.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>

commit | commitdiff | tree

Gwan-gyeong Mun [Tue, 26 Sep 2017 08:14:17 +0000 (01:14 -0700)]

anv: add an assertion in genX(BeginCommandBuffer)

To check a valid usage requirement.

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>

commit | commitdiff | tree

Gwan-gyeong Mun [Tue, 26 Sep 2017 08:14:16 +0000 (01:14 -0700)]

radv: add an assertion in radv_BeginCommandBuffer()

To check a valid usage requirement.

CID: 1401616

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>

commit | commitdiff | tree

Gwan-gyeong Mun [Thu, 24 Aug 2017 14:17:33 +0000 (23:17 +0900)]

gallium/docs: add reference links for resource_create method

It adds reference links for arguments usage and bind of resource_create().

Signed-off-by: Mun Gwan-gyeong <elongbug@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>