mesa.git
8 years agoradeonsi: disable ReZ
Marek Olšák [Wed, 12 Oct 2016 19:47:41 +0000 (21:47 +0200)]
radeonsi: disable ReZ

This is a serious performance fix. Discovered by luck.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94354

Cc: 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: implement TC-compatible HTILE
Marek Olšák [Tue, 11 Oct 2016 21:19:46 +0000 (23:19 +0200)]
radeonsi: implement TC-compatible HTILE

so that decompress blits aren't needed and depth texturing needs less
memory bandwidth.

Z16 and Z24 are promoted to Z32_FLOAT by the driver, because TC-compatible
HTILE only supports Z32_FLOAT. This doubles memory footprint for Z16.
The format promotion is not visible to state trackers.
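
To make the footprint claim concrete, here is a rough sketch of the
arithmetic. The enum and helper names are hypothetical and not taken from
radeonsi, Z24 is assumed to live in a 32-bit word, and alignment, HTILE
metadata and stencil are ignored.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the depth formats named above. */
enum sketch_depth_format { SKETCH_Z16, SKETCH_Z24, SKETCH_Z32_FLOAT };

/* Bytes per pixel of the depth plane only. */
static size_t sketch_depth_bpp(enum sketch_depth_format fmt)
{
    switch (fmt) {
    case SKETCH_Z16:       return 2;
    case SKETCH_Z24:       return 4; /* assumption: padded to 32 bits */
    case SKETCH_Z32_FLOAT: return 4;
    }
    return 0;
}

/* Unpadded size of a w x h depth plane in bytes. */
static size_t sketch_depth_plane_bytes(enum sketch_depth_format fmt,
                                       size_t w, size_t h)
{
    return sketch_depth_bpp(fmt) * w * h;
}
```

Under these assumptions a Z16 surface doubles in size when promoted, while
a Z24 surface stays the same size.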

This is part of TC-compatible renderbuffer compression, which has 3 parts:
DCC, HTILE, FMASK. Only TC-compatible FMASK compression is missing now.

I don't see a measurable increase in performance though.

(I tested Talos Principle and DiRT: Showdown, the latter is improved by
 0.5%, which is almost noise, and it originally used layered Z16,
 so at least we know that Z16 promoted to Z32F isn't slower now)

Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agogallium: add PIPE_RESOURCE_FLAG_TEXTURING_MORE_LIKELY
Marek Olšák [Wed, 12 Oct 2016 01:06:08 +0000 (03:06 +0200)]
gallium: add PIPE_RESOURCE_FLAG_TEXTURING_MORE_LIKELY

For performance tuning in drivers. It filters out window system
framebuffers and OpenGL renderbuffers.

radeonsi will use this to guess whether a depth buffer will be read
by a shader. There is no guarantee about what will actually happen.

This is a departure from PIPE_BIND flags, which are defined to be strict
but are useless in practice.

Acked-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: fix regression in image atomics
Nicolai Hähnle [Thu, 13 Oct 2016 14:03:06 +0000 (16:03 +0200)]
radeonsi: fix regression in image atomics

Caused by a bad rebase when pushing commit 76a940893.

8 years agost/mesa: fix vertex elements setup for doubles
Nicolai Hähnle [Mon, 10 Oct 2016 18:20:22 +0000 (20:20 +0200)]
st/mesa: fix vertex elements setup for doubles

Whether one or two slots are taken up by one API array depends on the
vertex shader, not on how the array is configured. When an array is
set up with fewer components than the shader expects, the high components
are undefined.

Fixes GL45-CTS.vertex_attrib_binding.basic-inputL-case1.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Dave Airlie <airlied@redhat.com>
8 years agost/glsl_to_tgsi: remove unnecessary ir_instruction argument from get_opcode
Nicolai Hähnle [Mon, 10 Oct 2016 09:44:43 +0000 (11:44 +0200)]
st/glsl_to_tgsi: remove unnecessary ir_instruction argument from get_opcode

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
8 years agost/glsl_to_tgsi: fix textureGatherOffset with indirectly loaded offsets
Nicolai Hähnle [Mon, 10 Oct 2016 09:44:03 +0000 (11:44 +0200)]
st/glsl_to_tgsi: fix textureGatherOffset with indirectly loaded offsets

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
8 years agost/glsl_to_tgsi: simplify translate_tex_offset
Nicolai Hähnle [Sun, 9 Oct 2016 20:28:30 +0000 (22:28 +0200)]
st/glsl_to_tgsi: simplify translate_tex_offset

This fixes a bug with offsets from uniforms which seems to have only been
noticed as a crash in piglit's
arb_gpu_shader5/compiler/builtin-functions/fs-gatherOffset-uniform-offset.frag
on radeonsi.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
8 years agoradeonsi: fix the coordinate overloading of llvm.amdgcn.image.atomic.cmpswap.*
Nicolai Hähnle [Mon, 10 Oct 2016 13:09:40 +0000 (15:09 +0200)]
radeonsi: fix the coordinate overloading of llvm.amdgcn.image.atomic.cmpswap.*

Fixes GL45-CTS.shader_image_load_store.basic-allTargets-atomic*

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agoradv: Return correct result in EnumeratePhysicalDevices
Nicolas Koch [Wed, 12 Oct 2016 11:55:46 +0000 (13:55 +0200)]
radv: Return correct result in EnumeratePhysicalDevices

If pPhysicalDevices is too small for all physical devices,
the driver must return VK_INCOMPLETE. Since only a single
physical device is supported, this is only the case when
*pPhysicalDeviceCount == 0 && pPhysicalDevices != NULL.
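
The rule can be sketched for a single-device driver as follows. This is not
the radv code: the types and the function are local stand-ins so that the
sketch compiles without the Vulkan headers, and only the two result values
are meant to mirror VkResult.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-ins; the numeric values mirror VK_SUCCESS and VK_INCOMPLETE. */
typedef enum { SKETCH_VK_SUCCESS = 0, SKETCH_VK_INCOMPLETE = 5 } sketch_result;
typedef void *sketch_physical_device;

static sketch_result
sketch_enumerate_physical_devices(sketch_physical_device the_device,
                                  uint32_t *pPhysicalDeviceCount,
                                  sketch_physical_device *pPhysicalDevices)
{
    if (pPhysicalDevices == NULL) {
        /* Query mode: only report how many devices exist. */
        *pPhysicalDeviceCount = 1;
        return SKETCH_VK_SUCCESS;
    }

    /* An array was supplied but has no room for the one device. */
    if (*pPhysicalDeviceCount == 0)
        return SKETCH_VK_INCOMPLETE;

    pPhysicalDevices[0] = the_device;
    *pPhysicalDeviceCount = 1;
    return SKETCH_VK_SUCCESS;
}
```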

Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agost/mesa: only flip stipple pattern for winsys fbo's
Ilia Mirkin [Wed, 12 Oct 2016 18:01:34 +0000 (14:01 -0400)]
st/mesa: only flip stipple pattern for winsys fbo's

Gallium is completely oblivious to whether the fbo is flipped or not.
Only flip the stipple pattern when the fbo is flipped as well. Otherwise
the driver has no idea when to unflip the pattern.
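
As an illustration only, a conditional vertical flip of a 32x32 stipple
pattern could look like the sketch below. The function name and the
fb_is_flipped parameter are hypothetical; the actual st/mesa code may be
organised differently.

```c
#include <stdbool.h>
#include <stdint.h>

#define STIPPLE_ROWS 32

/* Copy the pattern, reversing the row order only for flipped fbos. */
static void sketch_place_stipple(const uint32_t in[STIPPLE_ROWS],
                                 uint32_t out[STIPPLE_ROWS],
                                 bool fb_is_flipped)
{
    for (int row = 0; row < STIPPLE_ROWS; row++) {
        int src = fb_is_flipped ? STIPPLE_ROWS - 1 - row : row;
        out[row] = in[src];
    }
}
```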

Fixes bin/gl-2.1-polygon-stipple-fs -fbo

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tested-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agoswr: automake: add ar_eventhandlerfile_h.template to the tarball
Emil Velikov [Wed, 12 Oct 2016 15:06:47 +0000 (16:06 +0100)]
swr: automake: add ar_eventhandlerfile_h.template to the tarball

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
8 years agoradv: add all headers to the sources list
Emil Velikov [Wed, 12 Oct 2016 00:03:25 +0000 (01:03 +0100)]
radv: add all headers to the sources list

Otherwise they'll be missing from the tarball and the build will fail.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
8 years agonvc0/ir: fix textureGather with a single offset
Ilia Mirkin [Wed, 12 Oct 2016 14:24:59 +0000 (10:24 -0400)]
nvc0/ir: fix textureGather with a single offset

Recent fix for non-const offsets broke the case of a single offset (vs 4
offsets). The later code relies on the offs array to contain null values
to tell whether they should be added onto the srcs list.

Fixes: 5239bd592 ("nvc0/ir: fix overwriting of value backing non-constant gather offset")
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
8 years agonv50/ir: copy over value's register id when resolving merge of a phi
Ilia Mirkin [Mon, 10 Oct 2016 20:57:50 +0000 (16:57 -0400)]
nv50/ir: copy over value's register id when resolving merge of a phi

The offset needs to be properly copied over to the phi value, otherwise
it will get assigned to the base of the merge instead of the proper
location.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
8 years agost/mesa: enable ARB_enhanced_layouts and turn the cap on
Nicolai Hähnle [Thu, 6 Oct 2016 21:10:22 +0000 (23:10 +0200)]
st/mesa: enable ARB_enhanced_layouts and turn the cap on

v2: mark llvmpipe & softpipe properly as well (Jason Wood)

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
8 years agost/glsl_to_tgsi: adjust swizzles and writemasks for explicit components
Nicolai Hähnle [Fri, 7 Oct 2016 10:19:33 +0000 (12:19 +0200)]
st/glsl_to_tgsi: adjust swizzles and writemasks for explicit components

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
8 years agost/glsl_to_tgsi: explicitly track all input and output declaration
Nicolai Hähnle [Fri, 7 Oct 2016 10:19:11 +0000 (12:19 +0200)]
st/glsl_to_tgsi: explicitly track all input and output declaration

In order to be able to emit overlapping input and output array
declarations, we flip the logic of emitting those declarations on its
head: rather than iterating over slots and emitting the corresponding
declarations, we iterate over the declarations from GLSL and emit those.

v2: fix some regressions related to structs
v3: fix a regression in geometry and tessellation shader array handling

Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v2)
Reviewed-by: Dave Airlie <airlied@redhat.com> (v2)
8 years agost/glsl_to_tgsi: mark "gaps" in input/output arrays as used
Nicolai Hähnle [Fri, 7 Oct 2016 19:30:05 +0000 (21:30 +0200)]
st/glsl_to_tgsi: mark "gaps" in input/output arrays as used

In some cases, a shader may have an input/output array but not use some
entries in the middle. This happens with eON games, for example.

We emit declarations that cover the entire array range even if there are
some unused gaps. This patch now reflects that in the InputsRead etc.
fields to ensure the various input/outputMapping arrays are actually
correct, which will be important when we re-jiggle the way declarations
are emitted.
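
A minimal sketch of the idea, assuming the used slots are tracked in a
64-bit mask like InputsRead. The helper name and its parameters are
hypothetical and first + count is assumed not to exceed 64.

```c
#include <stdint.h>

/* Mark every slot in [first, first + count) as read, gaps included. */
static uint64_t sketch_mark_array_slots(uint64_t inputs_read,
                                        unsigned first, unsigned count)
{
    uint64_t range = (count >= 64) ? ~UINT64_C(0)
                                   : ((UINT64_C(1) << count) - 1);
    return inputs_read | (range << first);
}
```

For example, an array covering slots 4..7 of which only slots 4 and 7 are
referenced ends up with all four bits set.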

v2: fix a typo (Edward O'Callaghan)

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
8 years agost/glsl_to_tgsi: disable on-the-fly peephole for 64-bit operations
Nicolai Hähnle [Fri, 7 Oct 2016 14:15:30 +0000 (16:15 +0200)]
st/glsl_to_tgsi: disable on-the-fly peephole for 64-bit operations

This optimization is incorrect with 64-bit operations, because the
channel-splitting logic in emit_asm ends up being applied twice to
the source operands.

A lucky coincidence of how the writemask test works resulted in this
optimization basically never being applied anyway. As far as I can tell,
the only case where it would (incorrectly) have been applied is something
like

    dvec2 d;
    float x = (float)d.y;

which nobody seems to have ever done. But the moral equivalent does occur
in one of the component layout piglit tests.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
8 years agost/glsl_to_tgsi: simpler fixup of empty writemasks
Nicolai Hähnle [Fri, 7 Oct 2016 10:49:36 +0000 (12:49 +0200)]
st/glsl_to_tgsi: simpler fixup of empty writemasks

Empty writemasks mean "copy everything", so we can always just use the number
of vector elements (which uses the GLSL meaning here, i.e. each double is a
single element/writemask bit).
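
The fixup described above amounts to something like the following sketch.
The function name is hypothetical, and vector_elements is assumed to be in
the range 1..4.

```c
/* Replace an empty writemask with one bit per GLSL vector element. */
static unsigned sketch_fixup_writemask(unsigned writemask,
                                       unsigned vector_elements)
{
    if (writemask == 0)
        writemask = (1u << vector_elements) - 1;
    return writemask;
}
```

With the GLSL meaning of elements, a dvec2 with an empty writemask yields
0x3 rather than 0xF.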

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
8 years agost/glsl_to_tgsi: explicit handling of writemask for depth/stencil export
Nicolai Hähnle [Fri, 7 Oct 2016 15:33:07 +0000 (17:33 +0200)]
st/glsl_to_tgsi: explicit handling of writemask for depth/stencil export

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
8 years agoglsl: dump explicit location when printing IR
Nicolai Hähnle [Thu, 6 Oct 2016 21:10:10 +0000 (23:10 +0200)]
glsl: dump explicit location when printing IR

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
8 years agotgsi/ureg: add ureg_DECL_output_layout
Nicolai Hähnle [Wed, 12 Oct 2016 15:24:37 +0000 (17:24 +0200)]
tgsi/ureg: add ureg_DECL_output_layout

For specifying an exact location/component.

v2: change the order of parameters (Dave)

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1)
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
8 years agotgsi/ureg: add layout/component input declarations
Nicolai Hähnle [Fri, 7 Oct 2016 10:07:21 +0000 (12:07 +0200)]
tgsi/ureg: add layout/component input declarations

v2: change the order of parameters (Dave)

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1)
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
8 years agotgsi/scan: fix num_inputs/num_outputs for shaders with overlapping arrays
Nicolai Hähnle [Fri, 7 Oct 2016 10:53:55 +0000 (12:53 +0200)]
tgsi/scan: fix num_inputs/num_outputs for shaders with overlapping arrays

v2: remove a tautological left-over assert (Marek)

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> (v1)
Reviewed-by: Dave Airlie <airlied@redhat.com> (v1)
8 years agogallium: add PIPE_CAP_TGSI_ARRAY_COMPONENTS
Nicolai Hähnle [Fri, 7 Oct 2016 07:42:55 +0000 (09:42 +0200)]
gallium: add PIPE_CAP_TGSI_ARRAY_COMPONENTS

This is a screen cap because drivers are expected to support it either
for all shader types or for none of them.

Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
8 years agoradeonsi: Use the new image load/store intrinsic signatures
Tom Stellard [Tue, 11 Oct 2016 21:06:54 +0000 (21:06 +0000)]
radeonsi: Use the new image load/store intrinsic signatures

This patch requires LLVM r284024 or newer.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: Add function for converting LLVM type to intrinsic string
Tom Stellard [Tue, 11 Oct 2016 20:23:52 +0000 (20:23 +0000)]
radeonsi: Add function for converting LLVM type to intrinsic string

The existing function only worked for integer types.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: Refactor image store/load intrinsic name creation
Tom Stellard [Tue, 11 Oct 2016 16:43:36 +0000 (16:43 +0000)]
radeonsi: Refactor image store/load intrinsic name creation

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agowinsys/amdgpu: fix infinite loop w/ RADEON_NOOP=1 caused by unsubmitted fences
Marek Olšák [Mon, 10 Oct 2016 20:24:27 +0000 (22:24 +0200)]
winsys/amdgpu: fix infinite loop w/ RADEON_NOOP=1 caused by unsubmitted fences

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: fix R600_DEBUG=precompile for shader-db
Marek Olšák [Tue, 11 Oct 2016 14:55:41 +0000 (16:55 +0200)]
radeonsi: fix R600_DEBUG=precompile for shader-db

radeonsi no longer supports pixel shaders without interpolation optimizations,
which led to assertion failures in si_shader_ps when running shader-db.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: use TC write-back instead of full cache invalidation
Marek Olšák [Mon, 10 Oct 2016 16:51:24 +0000 (18:51 +0200)]
radeonsi: use TC write-back instead of full cache invalidation

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: implement TC L2 write-back (flush) without cache invalidation
Marek Olšák [Mon, 10 Oct 2016 16:49:22 +0000 (18:49 +0200)]
radeonsi: implement TC L2 write-back (flush) without cache invalidation

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoradeonsi: don't invalidate VMEM L1 for memory barriers for index buffers
Marek Olšák [Mon, 10 Oct 2016 15:39:43 +0000 (17:39 +0200)]
radeonsi: don't invalidate VMEM L1 for memory barriers for index buffers

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agonv50/ir: optimize ADD(SHL(a, b), c) to SHLADD(a, b, c)
Samuel Pitoiset [Thu, 6 Oct 2016 23:16:24 +0000 (01:16 +0200)]
nv50/ir: optimize ADD(SHL(a, b), c) to SHLADD(a, b, c)

total instructions in shared programs :2286901 -> 2284473 (-0.11%)
total gprs used in shared programs    :335256 -> 335273 (0.01%)
total local used in shared programs   :31968 -> 31968 (0.00%)

                local        gpr       inst      bytes
    helped           0          41         852         852
      hurt           0          44          23          23
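
For reference, the arithmetic identity behind the rewrite can be written
out as below. These are plain C stand-ins and not nv50/ir code; the shift
amount b is assumed to be less than 32.

```c
#include <stdint.h>

/* Before: two operations, SHL followed by ADD. */
static uint32_t sketch_shl_then_add(uint32_t a, uint32_t b, uint32_t c)
{
    uint32_t t = a << b;
    return t + c;
}

/* After: the same value as one fused SHLADD(a, b, c). */
static uint32_t sketch_shladd(uint32_t a, uint32_t b, uint32_t c)
{
    return (a << b) + c;
}
```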

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
8 years agomapi: fix out-of-tree build dependencies
Nicolai Hähnle [Tue, 11 Oct 2016 13:43:44 +0000 (15:43 +0200)]
mapi: fix out-of-tree build dependencies

We shouldn't be using wildcard here in the first place, but changing that
is some effort. As it stands, make -p confirms that glapi_gen_mapi_deps only
contains mapi_abi.py when building outside the Mesa tree.

As a result, only some of the tables were updated when XML files change, but
not the tables for shared glapi. This change ensures that we pick up the
XML files and scripts from the source tree as dependencies also for shared
glapi.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
8 years agodraw: initialize shader inputs
Roland Scheidegger [Tue, 11 Oct 2016 22:00:28 +0000 (00:00 +0200)]
draw: initialize shader inputs

This should make the code more robust if a shader tries to use inputs which
aren't defined by the vertex element layout (which usually shouldn't happen).

No piglit change.

Reviewed-by: Brian Paul <brianp@vmware.com>
8 years agoradv: trivial case stmt style fixups
Edward O'Callaghan [Tue, 11 Oct 2016 00:43:09 +0000 (11:43 +1100)]
radv: trivial case stmt style fixups

Relocate a 'default:' to the end of a case stmt and fix an
indent issue.

Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
8 years agoanv: Return correct result in EnumeratePhysicalDevices
Nicolas Koch [Thu, 6 Oct 2016 19:21:32 +0000 (21:21 +0200)]
anv: Return correct result in EnumeratePhysicalDevices

If pPhysicalDevices is too small for all physical devices,
the driver must return VK_INCOMPLETE.
Since only a single physical device is supported, this is only the case
when *pPhysicalDeviceCount == 0 && pPhysicalDevices != NULL.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years agoanv: Allow vp_info to be NULL in 3DSTATE_CLIP code.
Kenneth Graunke [Thu, 29 Sep 2016 18:52:34 +0000 (11:52 -0700)]
anv: Allow vp_info to be NULL in 3DSTATE_CLIP code.

pViewportState may be NULL if rasterization is disabled.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years agoanv: Fix anv_pipeline_validate_create_info assertions.
Kenneth Graunke [Thu, 29 Sep 2016 18:42:43 +0000 (11:42 -0700)]
anv: Fix anv_pipeline_validate_create_info assertions.

Many of these can be "NULL if the pipeline has rasterization disabled."
Also, we should assert that pMultisampleState exists.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years agotrace: add invalidate_resource callback
Ilia Mirkin [Tue, 11 Oct 2016 03:17:20 +0000 (23:17 -0400)]
trace: add invalidate_resource callback

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
8 years agoradv/winsys: Fix radv_amdgpu_cs_grow min_size argument. (v2)
Gustaw Smolarczyk [Thu, 6 Oct 2016 17:50:47 +0000 (19:50 +0200)]
radv/winsys: Fix radv_amdgpu_cs_grow min_size argument. (v2)

It's supposed to be the minimum amount by which we want to grow the cs,
not the minimum size of the cs after growth.
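
The difference between the two readings can be sketched like this. The
helper is hypothetical, and the doubling policy is an assumption made for
illustration, not something taken from radv_amdgpu_cs_grow.

```c
#include <stddef.h>

/* New capacity when min_growth is "grow by at least this much". */
static size_t sketch_grown_size(size_t current_size, size_t min_growth)
{
    size_t doubled = current_size * 2;
    size_t needed = current_size + min_growth;
    return doubled > needed ? doubled : needed;
}
```

Reading the argument as a minimum final size instead would treat a request
of 16 on a 1024-entry cs as already satisfied, so nothing would be
guaranteed to fit.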

v2: Unbreak use_ib_bos.
    Don't mask the ib_size when !use_ib_bos, since it's not needed.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agoradv: fix strict aliasing violation
Grigori Goronzy [Tue, 11 Oct 2016 22:47:20 +0000 (00:47 +0200)]
radv: fix strict aliasing violation

Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agoradv: fix uninitialized variables
Grigori Goronzy [Tue, 11 Oct 2016 22:47:19 +0000 (00:47 +0200)]
radv: fix uninitialized variables

This gets rid of "may be used uninitialized" compiler warnings.

Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agoradv: add missing unreachable
Grigori Goronzy [Tue, 11 Oct 2016 22:47:18 +0000 (00:47 +0200)]
radv: add missing unreachable

Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agoradv: remove the validation layer and some related bits.
Dave Airlie [Tue, 11 Oct 2016 22:37:14 +0000 (08:37 +1000)]
radv: remove the validation layer and some related bits.

As pointed out by Emil this isn't used in anv anymore,
and it was totally unused in radv anyways.

Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agoradv: drop entrypoint split out.
Dave Airlie [Tue, 11 Oct 2016 05:57:58 +0000 (15:57 +1000)]
radv: drop entrypoint split out.

radv really doesn't need a different dispatch per gen yet;
there really aren't that many differences yet.

Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agoradv: drop the RADV_CALL macro.
Dave Airlie [Tue, 11 Oct 2016 05:54:52 +0000 (15:54 +1000)]
radv: drop the RADV_CALL macro.

This is leftover from anv, and we really never needed it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agoradv: check driver name before calling amdgpu.
Dave Airlie [Tue, 11 Oct 2016 05:21:25 +0000 (15:21 +1000)]
radv: check driver name before calling amdgpu.

This checks the kernel driver name is amdgpu before calling
libdrm_amdgpu.

This avoids the following error:
amdgpu_device_initialize: DRM version is 1.6.0 but this driver is only compatible with 3.x.x

when run on a machine with i915 graphics as well as amdgpu.
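
The check boils down to a string comparison such as the sketch below. The
helper name is hypothetical; the name would presumably come from the kernel
via something like libdrm's drmGetVersion(), which is an assumption about
how radv obtains it.

```c
#include <stdbool.h>
#include <string.h>

/* True only when the kernel driver behind the fd reports "amdgpu". */
static bool sketch_is_amdgpu_driver(const char *kernel_driver_name)
{
    return kernel_driver_name != NULL &&
           strcmp(kernel_driver_name, "amdgpu") == 0;
}
```

Devices that fail this test would be skipped before any libdrm_amdgpu
initialization is attempted.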

Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agoradv: fix memory leak from physical device if wsi fails
Dave Airlie [Tue, 11 Oct 2016 22:52:56 +0000 (08:52 +1000)]
radv: fix memory leak from physical device if wsi fails

Inspired by patch from Edward O'Callaghan <funfunctor@folklore1984.net>
which didn't do it right.

Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agoradv/winsys: Fix mem leak at failed do_winsys_init() call site
Edward O'Callaghan [Tue, 11 Oct 2016 11:43:07 +0000 (22:43 +1100)]
radv/winsys: Fix mem leak at failed do_winsys_init() call site

This is probably unlikely; however, ensure we don't leak a heap
allocation on the failure path.

V.2:
 also fix missing 'amdgpu_device_deinitialize()' calls (Emil Velikov).

Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agoradv/winsys: Trivial style and readability fixups
Edward O'Callaghan [Tue, 11 Oct 2016 09:04:47 +0000 (20:04 +1100)]
radv/winsys: Trivial style and readability fixups

Drop/add a few newlines where appropriate and drop a couple of
unnecessary braces.

Signed-off-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agoradeonsi: emit TA_CS_BC_BASE_ADDR on SI only if the kernel allows it
Marek Olšák [Mon, 10 Oct 2016 11:23:55 +0000 (13:23 +0200)]
radeonsi: emit TA_CS_BC_BASE_ADDR on SI only if the kernel allows it

Reviewed-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
8 years agoswr: [rasterizer archrast] update proto file
Tim Rowley [Mon, 10 Oct 2016 23:32:31 +0000 (18:32 -0500)]
swr: [rasterizer archrast] update proto file

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
8 years agoswr: [rasterizer archrast] add support for stats files
Tim Rowley [Mon, 10 Oct 2016 16:41:33 +0000 (11:41 -0500)]
swr: [rasterizer archrast] add support for stats files

Only stat and counter events are saved to the event files.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
8 years agoswr: [rasterizer jitter] remove architecture override
Tim Rowley [Mon, 10 Oct 2016 16:07:03 +0000 (11:07 -0500)]
swr: [rasterizer jitter] remove architecture override

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
8 years agoswr: [rasterizer jitter] adjust jitmanager assert
Tim Rowley [Fri, 7 Oct 2016 17:24:52 +0000 (12:24 -0500)]
swr: [rasterizer jitter] adjust jitmanager assert

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
8 years agoswr: [rasterizer] eliminate unused label warnings on gcc
Tim Rowley [Fri, 7 Oct 2016 14:52:19 +0000 (09:52 -0500)]
swr: [rasterizer] eliminate unused label warnings on gcc

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
8 years agoswr: [rasterizer core] implement depth bounds test
Tim Rowley [Fri, 7 Oct 2016 02:06:59 +0000 (21:06 -0500)]
swr: [rasterizer core] implement depth bounds test

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
8 years agoswr: [rasterizer core] update/add formats
Tim Rowley [Thu, 6 Oct 2016 21:26:56 +0000 (16:26 -0500)]
swr: [rasterizer core] update/add formats

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
8 years agoswr: [rasterizer core] SwrStoreTiles api change
Tim Rowley [Thu, 6 Oct 2016 18:22:35 +0000 (13:22 -0500)]
swr: [rasterizer core] SwrStoreTiles api change

SwrStoreTiles now takes a mask of surfaces to store.  Reduces
overhead when storing multiple render targets.
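
A rough sketch of what a surface mask allows: one call walks the set bits
instead of the caller issuing one store per render target. The names and
the limit of 8 surfaces are hypothetical and not the SWR API.

```c
#include <stdint.h>

#define SKETCH_MAX_SURFACES 8

/* Collect the indices selected by surface_mask; returns how many. */
static unsigned sketch_surfaces_to_store(uint32_t surface_mask,
                                         unsigned out[SKETCH_MAX_SURFACES])
{
    unsigned count = 0;
    for (unsigned i = 0; i < SKETCH_MAX_SURFACES; i++) {
        if (surface_mask & (1u << i))
            out[count++] = i;
    }
    return count;
}
```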

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
8 years agoswr: [rasterizer scripts] add ENABLE_ASSERT_DIALOGS knob for windows
Tim Rowley [Wed, 5 Oct 2016 18:48:40 +0000 (13:48 -0500)]
swr: [rasterizer scripts] add ENABLE_ASSERT_DIALOGS knob for windows

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
8 years agoswr: [rasterizer archrast] add mako template
Tim Rowley [Wed, 5 Oct 2016 18:45:12 +0000 (13:45 -0500)]
swr: [rasterizer archrast] add mako template

Add template for generating code to save events to a file.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
8 years agoswr: [rasterizer core] disable cull for rect_list
Tim Rowley [Tue, 4 Oct 2016 18:36:12 +0000 (13:36 -0500)]
swr: [rasterizer core] disable cull for rect_list

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
8 years agoswr: [rasterizer core] add support for "RAW" surface format
Tim Rowley [Tue, 4 Oct 2016 18:14:32 +0000 (13:14 -0500)]
swr: [rasterizer core] add support for "RAW" surface format

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
8 years agoswr: [rasterizer core] align Macrotile FIFO memory to SIMD size
Tim Rowley [Tue, 4 Oct 2016 17:59:30 +0000 (12:59 -0500)]
swr: [rasterizer core] align Macrotile FIFO memory to SIMD size

Align and use streaming store instructions for BE fifo queues.
Provides slightly faster enqueue and doesn't pollute the caches.
Add appropriate memory fences to ensure streaming writes are
globally visible.

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
8 years agoswr: [rasterizer common] remove threadviz code
Tim Rowley [Mon, 3 Oct 2016 21:39:10 +0000 (16:39 -0500)]
swr: [rasterizer common] remove threadviz code

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
8 years agoswr: [rasterizer memory] split load/store for compile speed
Tim Rowley [Fri, 7 Oct 2016 17:07:07 +0000 (12:07 -0500)]
swr: [rasterizer memory] split load/store for compile speed

Signed-off-by: Tim Rowley <timothy.o.rowley@intel.com>
8 years agoegl: add eglSwapBuffersWithDamageKHR
Eric Engestrom [Mon, 10 Oct 2016 16:33:17 +0000 (17:33 +0100)]
egl: add eglSwapBuffersWithDamageKHR

EGL_KHR_swap_buffers_with_damage is actually already supported, as it is
technically nothing but a rename of EGL_EXT_swap_buffers_with_damage.

To that effect, both extensions are advertised depending on the same
condition, and the new entrypoint simply redirects to the previous one.

Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
8 years agointel/genxml: fix building rules for aubinator required headers
Mauro Rossi [Mon, 10 Oct 2016 21:43:42 +0000 (23:43 +0200)]
intel/genxml: fix building rules for aubinator required headers

New generated headers were introduced by commit 63a366a
"intel: aubinator: generate a standalone binary"

Android does not need aubinator yet, so in order to avoid a build error,
the new genxml headers required by aubinator are defined in a separate list.

If required, building rules for Android will be added later.
[Emil Velikov: don't use a _HEADERS variable name (causes warnings)]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
8 years agoradv: automake: move libamdgpu_addrlib.la to VULKAN_LIB_DEPS
Emil Velikov [Mon, 10 Oct 2016 16:07:33 +0000 (17:07 +0100)]
radv: automake: move libamdgpu_addrlib.la to VULKAN_LIB_DEPS

The static library is analogous to the intel ISL, which is required for
both hardware and (to be added) testing library.

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
8 years agoradv: automake: remove unused variables
Emil Velikov [Mon, 10 Oct 2016 16:02:23 +0000 (17:02 +0100)]
radv: automake: remove unused variables

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
8 years agoradv: automake: include the python scripts/formats table in the tarball
Emil Velikov [Mon, 10 Oct 2016 16:01:47 +0000 (17:01 +0100)]
radv: automake: include the python scripts/formats table in the tarball

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
8 years agomesa: fix error handling in _mesa_TransformFeedbackVaryings
Tapani Pälli [Mon, 10 Oct 2016 06:49:36 +0000 (09:49 +0300)]
mesa: fix error handling in _mesa_TransformFeedbackVaryings

This patch changes both TransformFeedbackVaryings and
GetTransformFeedbackVarying to use _mesa_lookup_shader_program_err,
which handles errors correctly for invalid values of shader program.

Fixes the following dEQP test:
   dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.transform_feedback_varyings

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98135
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
8 years agoi965: solve cubemap negative x/y/z faces buffer offset issue in dEQP.
Xu,Randy [Sat, 8 Oct 2016 08:15:59 +0000 (16:15 +0800)]
i965: solve cubemap negative x/y/z faces buffer offset issue in dEQP.

Add the miptree level/slice x/y_offset when computing the surface offset
in brw_emit_surface_state. The surface offset has two parts: one is
mt->offset, which should be 32-aligned in width/height for a tiled
buffer; the other is
mt->level[current_level].slice[current_slice].x/y_offset.

This fixes 12 dEQP failures:
dEQP-EGL.functional.image.create.gles2_cubemap_negative_*_texture

Signed-off-by: Xu,Randy <randy.xu@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
8 years agoi915g: fix incorrect gl_FragCoord value
Nicholas Bishop [Thu, 25 Aug 2016 23:31:53 +0000 (19:31 -0400)]
i915g: fix incorrect gl_FragCoord value

On Intel Pineview M hardware, the i915 gallium driver doesn't output
the correct gl_FragCoord. It seems to always have an X coord of 0.0
and a Y coord of the window's height in pixels, e.g. 600.0f or such.

I believe this is a regression caused in part by this commit:
afa035031ff9e0c07a2297d864e46c76f7bfff58

The old behavior used the output at index zero, while the new behavior
uses actual zeroes. In the case of gl_FragCoord the output at index
zero happened to be the correct one, so the behavior appeared correct
although the code already had a bug.

Fixed by checking for I915_SEMANTIC_POS when setting up texCoords. If
the generic_mapping is I915_SEMANTIC_POS, look for the
TGSI_SEMANTIC_POSITION instead of a TGSI_SEMANTIC_GENERIC output.

https://bugs.freedesktop.org/show_bug.cgi?id=97477

Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Tested-by: Stéphane Marchesin <marcheu@chromium.org>
8 years agoRevert "mesa_glinterop: remove inclusion of GLX header"
Vinson Lee [Mon, 3 Oct 2016 22:16:30 +0000 (15:16 -0700)]
Revert "mesa_glinterop: remove inclusion of GLX header"

This reverts commit 8472045b16b3e4621553fe451a20a9ba9f0d44b6.

Conflicts:

include/GL/mesa_glinterop.h

This patch fixes this build error with GCC 4.4.

  Compiling src/glx/dri_common_interop.c ...
In file included from src/glx/dri_common_interop.c:33:
include/GL/mesa_glinterop.h:62: error: redefinition of typedef ‘GLXContext’
include/GL/glx.h:165: note: previous declaration of ‘GLXContext’ was here

Fixes: 8472045b16b3 ("mesa_glinterop: remove inclusion of GLX header")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96770
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
8 years agost/nine: More checks for GetRenderTargetData
Axel Davy [Sun, 2 Oct 2016 10:14:03 +0000 (12:14 +0200)]
st/nine: More checks for GetRenderTargetData

Fixes a wine test crash

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Patrick Rudolph <siro@das-labor.org>
8 years agost/nine: Add debug output for lost devices
Patrick Rudolph [Wed, 28 Sep 2016 18:11:34 +0000 (20:11 +0200)]
st/nine: Add debug output for lost devices

Add debug output to ease debugging.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Prevent crash in GetRenderTargetData
Patrick Rudolph [Wed, 28 Sep 2016 16:50:19 +0000 (18:50 +0200)]
st/nine: Prevent crash in GetRenderTargetData

Return error instead of crashing on source surfaces
with format D3DFMT_NULL.

Fix for issue #236.

Tested on Windows 7.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Set CLAMP_TO_EDGE on cubetextures
Patrick Rudolph [Sat, 24 Sep 2016 16:19:26 +0000 (18:19 +0200)]
st/nine: Set CLAMP_TO_EDGE on cubetextures

Wine tests show that cubetextures always use
PIPE_TEX_WRAP_CLAMP_TO_EDGE regardless of set
sampler states.

Fixes failing d3d9 wine test test_cube_wrap.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: handle possible failure of D3DWindowBuffer_create
Patrick Rudolph [Sat, 24 Sep 2016 09:34:33 +0000 (11:34 +0200)]
st/nine: handle possible failure of D3DWindowBuffer_create

Check for errors and pass them to the callers.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Assert on buffer creation failure
Patrick Rudolph [Sat, 24 Sep 2016 08:46:27 +0000 (10:46 +0200)]
st/nine: Assert on buffer creation failure

Add an assert to make sure buffer creation doesn't fail.
Add error handling in calling functions.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Use NineDevice9_CreateDepthStencilSurface in swapchain9
Patrick Rudolph [Fri, 23 Sep 2016 15:55:08 +0000 (17:55 +0200)]
st/nine: Use NineDevice9_CreateDepthStencilSurface in swapchain9

Replace custom code with NineDevice9_CreateDepthStencilSurface.
It now provides all the required functionality.

8 years agost/nine: Fix check and remove useless code in swapchain9
Axel Davy [Sat, 1 Oct 2016 22:58:48 +0000 (00:58 +0200)]
st/nine: Fix check and remove useless code in swapchain9

The removed code was there for two reasons:
1) Allow DF16, DF24, INTZ to be used as depth buffer
for swapchain, if the driver doesn't support
PIPE_BIND_SAMPLER_VIEW for the underlying format
2) Set PIPE_BIND_SAMPLER_VIEW if possible, such that
if StretchRect is called on the depth texture, it is happy.

1) These formats needed a workaround because
the check flags for them in CheckDeviceFormat were incorrect,
which led applications to think the formats were valid for
swapchains, even if they weren't supported.
2) StretchRect limitations for depth buffers force
the resource_copy_region path, which should be fine without
PIPE_BIND_SAMPLER_VIEW.

Thus fix the check for 1), and remove the code.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Implement MSAA quality levels
Patrick Rudolph [Thu, 22 Sep 2016 15:03:17 +0000 (17:03 +0200)]
st/nine: Implement MSAA quality levels

Advertise quality levels:
Each supported multisample count maps to one quality level.
The application doesn't know how many samples each quality level has.
For that reason it's not possible to set the multisample mask.
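
One plausible reading of this mapping, as a sketch rather than the st/nine
implementation: walk the candidate sample counts in ascending order and
give each supported count the next quality level. The callback type, the
function names and the upper bound of 16 samples are all assumptions made
for illustration:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability query standing in for a driver format check. */
typedef bool (*sample_count_supported_fn)(unsigned samples);

/*
 * Map a quality level to a sample count. Each supported count, visited in
 * ascending order, takes the next quality level. *levels_out receives the
 * number of advertised levels. Returns false on a quality level mismatch.
 */
static bool quality_to_samples(sample_count_supported_fn supported,
                               unsigned quality,
                               unsigned *samples_out,
                               unsigned *levels_out)
{
    unsigned levels = 0;
    bool found = false;

    assert(supported != NULL);
    for (unsigned samples = 2; samples <= 16; samples++) {
        if (!supported(samples))
            continue;
        if (levels == quality) {
            *samples_out = samples;
            found = true;
        }
        levels++;
    }
    *levels_out = levels;
    return found;
}

/* Example capability set: 2x, 4x and 8x are supported. */
static bool demo_supported(unsigned samples)
{
    return samples == 2 || samples == 4 || samples == 8;
}
```

With demo_supported, three quality levels are advertised, and a request
for a fourth one is reported as a mismatch.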

Return errors on quality level mismatch.

Fixes multisampling in several old games that lacked support until now.

Fix for issue #73.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Prepare update_framebuffer for MS quality levels
Patrick Rudolph [Fri, 30 Sep 2016 16:15:31 +0000 (18:15 +0200)]
st/nine: Prepare update_framebuffer for MS quality levels

Compare resource's nr_samples instead of D3D multisample level.
Required for multisample quality levels to work correctly.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Add additional error handling in CheckDeviceMultiSampleType
Patrick Rudolph [Fri, 30 Sep 2016 14:15:38 +0000 (16:15 +0200)]
st/nine: Add additional error handling in CheckDeviceMultiSampleType

Return one supported quality level in error cases.
Return error on invalid multisample count.

Fixes failing wine tests.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Fix compiler warning
Patrick Rudolph [Thu, 1 Sep 2016 16:07:55 +0000 (18:07 +0200)]
st/nine: Fix compiler warning

Comply with strict aliasing rules in SetPrivateData and struct pheader.
Casting char[1] to IUnknown** violates strict aliasing.
Compute the pointer to the body by adding the size of the header to the
header pointer.
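
A minimal sketch of the pointer computation, assuming a simplified header.
The struct below is hypothetical and does not match the real struct pheader;
pheader_body and pheader_create are illustrative names only:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header; the real struct pheader has different fields. */
struct pheader {
    size_t size; /* payload size in bytes */
};

/* The body starts right after the header: header pointer + header size. */
static void *pheader_body(struct pheader *header)
{
    assert(header != NULL);
    return (char *)header + sizeof(*header);
}

/* Allocate header and payload as one block and copy the payload in. */
static struct pheader *pheader_create(const void *data, size_t size)
{
    struct pheader *header = malloc(sizeof(*header) + size);

    if (!header)
        return NULL;
    header->size = size;
    memcpy(pheader_body(header), data, size);
    return header;
}
```

Going through a char pointer and memcpy avoids accessing the payload
through an lvalue of an incompatible type, which is what the warning is
about.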

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Remove resource9 {Set/Get/Free}PrivateData functions
Patrick Rudolph [Fri, 16 Sep 2016 15:33:52 +0000 (17:33 +0200)]
st/nine: Remove resource9 {Set/Get/Free}PrivateData functions

Remove {Set/Get/Free}PrivateData in resource9.
The functionality has been implemented in the IUnknown interface.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Remove volume9 {Set/Get/Free}PrivateData functions
Patrick Rudolph [Fri, 16 Sep 2016 15:32:20 +0000 (17:32 +0200)]
st/nine: Remove volume9 {Set/Get/Free}PrivateData functions

Remove {Set/Get/Free}PrivateData in volume9.
The functionality has been implemented in the IUnknown interface.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Switch {Set/Get/Free}PrivateData functions
Patrick Rudolph [Fri, 16 Sep 2016 15:29:47 +0000 (17:29 +0200)]
st/nine: Switch {Set/Get/Free}PrivateData functions

Switch the {Set/Get/Free}PrivateData functions to the newly introduced
IUnknown functions.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Implement {Set/Get/Free}PrivateData in iunknown
Patrick Rudolph [Fri, 16 Sep 2016 15:26:07 +0000 (17:26 +0200)]
st/nine: Implement {Set/Get/Free}PrivateData in iunknown

Implement {Set/Get/Free}PrivateData in iunknown to get rid
of duplicated code in resource9 and volume9.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Return device in NineSurface9_GetContainer
Patrick Rudolph [Fri, 16 Sep 2016 14:42:50 +0000 (16:42 +0200)]
st/nine: Return device in NineSurface9_GetContainer

According to MSDN the device is returned for surfaces that do
not have a regular container.

Such surfaces are:
OffscreenPlainSurface, DepthStencilSurface and RenderTarget

Tested and verified on Windows.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Allocate surface resources in surface ctor
Patrick Rudolph [Thu, 15 Sep 2016 18:28:17 +0000 (20:28 +0200)]
st/nine: Allocate surface resources in surface ctor

Allocate resources in surface ctor.
This allows the state tracker's internal memory accounting to be used.

Fix for issue #231.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Signed-off-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Fix D3DFMT_NULL size
Axel Davy [Sun, 2 Oct 2016 09:58:41 +0000 (11:58 +0200)]
st/nine: Fix D3DFMT_NULL size

D3DFMT_NULL is mapped to PIPE_FORMAT_NONE.
Instead of relying on PIPE_FORMAT_NONE to
return a size, pick one.
The size picked is the same as Wine's.

Signed-off-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Add debugging output
Patrick Rudolph [Wed, 14 Sep 2016 17:51:48 +0000 (19:51 +0200)]
st/nine: Add debugging output

Add DBG calls to NineTexture9_GetLevelDesc and
NineTexture9_GetSurfaceLevel to ease debugging.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
8 years agost/nine: Fix assert in NineUnknown_QueryInterface
Patrick Rudolph [Wed, 14 Sep 2016 17:50:16 +0000 (19:50 +0200)]
st/nine: Fix assert in NineUnknown_QueryInterface

Tests showed that calling this method on objects
that have a zero refcount is allowed.
Required for issue #230.

Signed-off-by: Patrick Rudolph <siro@das-labor.org>
Reviewed-by: Axel Davy <axel.davy@ens.fr>