mesa.git
7 years agoradeonsi: change the bit-packing of LS out/TCS in data
Nicolai Hähnle [Wed, 12 Apr 2017 08:16:07 +0000 (10:16 +0200)]
radeonsi: change the bit-packing of LS out/TCS in data

Avoid conflicts when merging various VS state bits.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: emit VS_STATE register explicitly from si_draw_vbo
Nicolai Hähnle [Wed, 12 Apr 2017 08:00:18 +0000 (10:00 +0200)]
radeonsi: emit VS_STATE register explicitly from si_draw_vbo

We will merge other derived state information into this register.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: extract derived tess state emit to higher level
Nicolai Hähnle [Wed, 12 Apr 2017 07:40:28 +0000 (09:40 +0200)]
radeonsi: extract derived tess state emit to higher level

Especially with subsequent changes, this makes it easier to see the
sequence of state emits at the higher level.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradeonsi: drop support for TGSI_SEMANTIC_VERTEXID_NOBASE
Nicolai Hähnle [Wed, 12 Apr 2017 08:58:37 +0000 (10:58 +0200)]
radeonsi: drop support for TGSI_SEMANTIC_VERTEXID_NOBASE

It is unused.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agoradv: Add more trace points.
Bas Nieuwenhuizen [Wed, 12 Apr 2017 22:06:48 +0000 (00:06 +0200)]
radv: Add more trace points.

Most trace points happen after an operation, so add a trace point
at the start of the command buffer.

Furthermore, add one after a CmdUpdateBuffer using CP_DMA as that
didn't emit one yet.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years agoradv: Ignore CmdUpdateBuffer with size 0.
Bas Nieuwenhuizen [Wed, 12 Apr 2017 22:04:23 +0000 (00:04 +0200)]
radv: Ignore CmdUpdateBuffer with size 0.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years agoradv: Enable query inheritance.
Bas Nieuwenhuizen [Wed, 12 Apr 2017 21:17:14 +0000 (23:17 +0200)]
radv: Enable query inheritance.

timestamp and pipeline_statistics only do something on begin & end,
so they don't need any action.

Occlusion queries only do something to enable/disable and that
register is set nowhere else so that doesn't need extra support either.
(We technically should fix it to update the reg with the number of
 samples, but that hasn't happened yet, so we only change it to
 enable/disable counting)

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years agoradv: enable variableMultisampleRate.
Bas Nieuwenhuizen [Wed, 12 Apr 2017 21:29:58 +0000 (23:29 +0200)]
radv: enable variableMultisampleRate.

This is only relevant with 0 attachments. In that case we do nothing
on subpass switch already, and the pipeline is the authoritative
source of the number of samples, so this shouldn't change anything.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years agogallium/hud: set the dump file streams to line buffered
Edmondo Tommasina [Wed, 5 Apr 2017 19:03:55 +0000 (21:03 +0200)]
gallium/hud: set the dump file streams to line buffered

Flush the HUD value streams to the dump files after every newline.

v2: check that fopen succeeded  (Julien)

Reviewed-and-Tested-by: Julien Isorce <jisorce@oblong.com>
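As an illustration of the change described above, here is a minimal sketch of opening a dump file and switching its stream to line buffering with the standard setvbuf() call. The helper name open_line_buffered() is hypothetical and is not taken from the HUD code; only fopen(), setvbuf() and fclose() are real C library calls.

```c
#include <stdio.h>

/* Hypothetical helper (not the actual HUD code): open a dump file and
 * make its stream line buffered, so that each completed line reaches
 * the file as soon as the newline is written. */
FILE *
open_line_buffered(const char *path)
{
   FILE *f = fopen(path, "w");

   /* Check that fopen succeeded before touching the stream. */
   if (!f)
      return NULL;

   /* _IOLBF requests line buffering. Passing NULL lets the C library
    * allocate the buffer itself. setvbuf() has to be called before any
    * other operation is performed on the stream. */
   if (setvbuf(f, NULL, _IOLBF, BUFSIZ) != 0) {
      fclose(f);
      return NULL;
   }

   return f;
}
```

A caller would write values with fprintf(f, "%f\n", value) and can then follow the file with a tool such as tail -f while the application is still running.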
7 years agoradv: fix stencil regression since new addrlib import
Dave Airlie [Thu, 13 Apr 2017 04:36:26 +0000 (14:36 +1000)]
radv: fix stencil regression since new addrlib import

The addrlib import meant we'd return after we attempted
to set up the no-stencil bits for an S8_UINT. Now we break
and use the stencil level info when creating the stencil DB
info.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agoradv: allocate thin textures as linear.
Dave Airlie [Thu, 13 Apr 2017 04:12:28 +0000 (14:12 +1000)]
radv: allocate thin textures as linear.

This is ported from radeonsi, and avoids the bug in the
addrlib code. This should probably be something addrlib
does for us, but for now this fixes the regression without
changing addrlib and aligns us with radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
7 years agoi965: add missing ir_unop_*/ir_binop_* in visit_leave()
Samuel Pitoiset [Tue, 11 Apr 2017 12:50:39 +0000 (14:50 +0200)]
i965: add missing ir_unop_*/ir_binop_* in visit_leave()

Fixes the following Clang warnings.

brw_fs_channel_expressions.cpp:219:12: warning: enumeration values 'ir_unop_ballot', 'ir_unop_read_first_invocation', and 'ir_binop_read_invocation' not handled in switch [-Wswitch]
   switch (expr->operation) {
           ^
1 warning generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
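The warning quoted above comes from a switch over an enum that has no default label, which lets the compiler check that every enumerator is handled. A minimal sketch of the pattern follows; the enum and function names are hypothetical and do not mirror the i965 code.

```c
/* Hypothetical expression opcodes, standing in for the ir_unop_* and
 * ir_binop_* enumerators. */
enum expr_op {
   EXPR_OP_NEG,
   EXPR_OP_ADD,
   EXPR_OP_BALLOT,   /* imagine this one was added later */
};

/* No default label on purpose: with -Wswitch the compiler reports any
 * enumerator that is missing from the switch, so a newly added opcode
 * cannot be forgotten silently. */
int
operand_count(enum expr_op op)
{
   switch (op) {
   case EXPR_OP_NEG:
      return 1;
   case EXPR_OP_ADD:
      return 2;
   case EXPR_OP_BALLOT:
      return 1;
   }

   /* Only reached for values outside the enum. */
   return 0;
}
```

Removing the EXPR_OP_BALLOT case from this sketch should reproduce a warning of the same kind as the one in the commit message.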
7 years agost/mesa: fix wrong comparison in update_framebuffer_state()
Samuel Pitoiset [Tue, 11 Apr 2017 12:19:19 +0000 (14:19 +0200)]
st/mesa: fix wrong comparison in update_framebuffer_state()

state_tracker/st_atom_framebuffer.c:208:27: warning: comparison of constant 4294967295 with expression of type 'uint16_t' (aka 'unsigned short') is always false [-Wtautological-constant-out-of-range-compare]
   if (framebuffer->width == UINT_MAX)
       ~~~~~~~~~~~~~~~~~~ ^  ~~~~~~~~
state_tracker/st_atom_framebuffer.c:210:28: warning: comparison of constant 4294967295 with expression of type 'uint16_t' (aka 'unsigned short') is always false [-Wtautological-constant-out-of-range-compare]
   if (framebuffer->height == UINT_MAX)
       ~~~~~~~~~~~~~~~~~~~ ^  ~~~~~~~~
2 warnings generated.

Fixes: eb0fd0e5f86 ("gallium: decrease the size of pipe_framebuffer_state - 96 -> 80 bytes")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
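To make the warning above concrete: a uint16_t can never hold 4294967295, so a comparison against UINT_MAX is always false once the field has been narrowed to 16 bits. The sketch below uses hypothetical helper names and assumes a platform where unsigned int is wider than 16 bits. Comparing against UINT16_MAX is one plausible fix, not necessarily the exact change made in the commit.

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical check for "dimension not set yet" on a 16-bit field.
 * The sentinel has to be the maximum of the field's own type. */
bool
dim_is_unset(uint16_t dim)
{
   return dim == UINT16_MAX;
}

/* Shows why the old comparison could never succeed: after widening,
 * the largest possible value is 65535, which is not UINT_MAX on
 * platforms where unsigned int is wider than 16 bits. */
bool
widened_equals_uint_max(uint16_t dim)
{
   unsigned widened = dim;

   return widened == UINT_MAX;
}
```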
7 years agoradeon: fix duplicate 'const' specifier
Samuel Pitoiset [Tue, 11 Apr 2017 12:55:12 +0000 (14:55 +0200)]
radeon: fix duplicate 'const' specifier

Fixes the following Clang warning.

In file included from radeon_debug.c:32:
./radeon_common_context.h:500:19: warning: duplicate 'const' declaration specifier [-Wduplicate-decl-specifier]
extern const char const *radeonVendorString;

v2: - do not remove the duplicate 'const' qualifier, fix it

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
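In the declaration quoted above, both const qualifiers apply to char, so the second one is redundant. If the intent was a constant pointer to constant characters, the second const belongs after the '*'. The sketch below shows that form; the variable name and string value are made up for illustration, and the v2 note does not spell out the exact declaration that was committed.

```c
#include <string.h>

/* "const char const *p" qualifies char twice and leaves the pointer
 * itself writable. Placing the second const after the '*' makes the
 * pointer constant as well. The value here is a placeholder. */
static const char *const vendor_string = "Example Vendor";

/* Small accessor so the constant is used somewhere. */
size_t
vendor_length(void)
{
   return strlen(vendor_string);
}
```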
7 years agosvga: remove unused vmw_dri1_intersect_src_bbox()
Samuel Pitoiset [Tue, 11 Apr 2017 12:33:13 +0000 (14:33 +0200)]
svga: remove unused vmw_dri1_intersect_src_bbox()

Fixes the following Clang warning.

vmw_screen_dri.c:130:1: warning: unused function 'vmw_dri1_intersect_src_bbox' [-Wunused-function]
vmw_dri1_intersect_src_bbox(struct drm_clip_rect *dst,
^
1 warning generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agollvmpipe: remove unused subpixel_snap() and fixed_to_float()
Samuel Pitoiset [Tue, 11 Apr 2017 12:30:42 +0000 (14:30 +0200)]
llvmpipe: remove unused subpixel_snap() and fixed_to_float()

Fixes the following Clang warnings.

lp_setup_tri.c:55:1: warning: unused function 'subpixel_snap' [-Wunused-function]
subpixel_snap(float a)
^
lp_setup_tri.c:61:1: warning: unused function 'fixed_to_float' [-Wunused-function]
fixed_to_float(int a)
^

v2: - do not remove subpixel_snap() (use !PIPE_ARCH_SSE instead)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
7 years agosoftpipe: remove unused sp_exec_fragment_shader()
Samuel Pitoiset [Tue, 11 Apr 2017 12:42:39 +0000 (14:42 +0200)]
softpipe: remove unused sp_exec_fragment_shader()

Fixes the following Clang warning.

sp_fs_exec.c:56:1: warning: unused function 'sp_exec_fragment_shader' [-Wunused-function]
sp_exec_fragment_shader(const struct sp_fragment_shader_variant *var)
^
1 warning generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agosoftpipe: remove unused quad_shade_stage()
Samuel Pitoiset [Tue, 11 Apr 2017 12:32:19 +0000 (14:32 +0200)]
softpipe: remove unused quad_shade_stage()

Fixes the following Clang warning.

sp_quad_fs.c:60:1: warning: unused function 'quad_shade_stage' [-Wunused-function]
quad_shade_stage(struct quad_stage *qs)
^
1 warning generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agosoftpipe: remove unused get_texel_quad_2d()
Samuel Pitoiset [Tue, 11 Apr 2017 12:29:35 +0000 (14:29 +0200)]
softpipe: remove unused get_texel_quad_2d()

Fixes the following Clang warning.

sp_tex_sample.c:802:1: warning: unused function 'get_texel_quad_2d' [-Wunused-function]
get_texel_quad_2d(const struct sp_sampler_view *sp_sview,
^
  CC       sp_tile_cache.lo
1 warning generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agotrace: remove some unused trace_dump_tag*() functions
Samuel Pitoiset [Tue, 11 Apr 2017 12:13:12 +0000 (14:13 +0200)]
trace: remove some unused trace_dump_tag*() functions

Fixes the following Clang warnings.

tr_dump.c:137:1: warning: unused function 'trace_dump_tag' [-Wunused-function]
trace_dump_tag(const char *name)
^
tr_dump.c:168:1: warning: unused function 'trace_dump_tag_begin2' [-Wunused-function]
trace_dump_tag_begin2(const char *name,
^
tr_dump.c:187:1: warning: unused function 'trace_dump_tag_begin3' [-Wunused-function]
trace_dump_tag_begin3(const char *name,
^
  CC       tr_texture.lo
3 warnings generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agodraw: remove unused wideline_stage()
Samuel Pitoiset [Tue, 11 Apr 2017 12:34:22 +0000 (14:34 +0200)]
draw: remove unused wideline_stage()

Fixes the following Clang warning.

draw/draw_pipe_wide_line.c:48:38: warning: unused function 'wideline_stage' [-Wunused-function]
static inline struct wideline_stage *wideline_stage( struct draw_stage *stage )
                                     ^
1 warning generated.

v2: - remove commented code (Roland Scheidegger)
v3: - remove half_line_width in the struct

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
7 years agodraw: remove unused overflow()
Samuel Pitoiset [Tue, 11 Apr 2017 12:10:09 +0000 (14:10 +0200)]
draw: remove unused overflow()

Fixes the following Clang warning.

draw/draw_pipe_vbuf.c:102:1: warning: unused function 'overflow' [-Wunused-function]
overflow( void *map, void *ptr, unsigned bytes, unsigned bufsz )
^
1 warning generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agomesa: remove some unused functions in the perf monitor area
Samuel Pitoiset [Tue, 11 Apr 2017 12:05:17 +0000 (14:05 +0200)]
mesa: remove some unused functions in the perf monitor area

Fixes the following Clang warnings.

main/performance_monitor.c:157:1: warning: unused function 'index_to_queryid' [-Wunused-function]
index_to_queryid(GLuint index)
^
main/performance_monitor.c:163:1: warning: unused function 'queryid_valid' [-Wunused-function]
queryid_valid(const struct gl_context *ctx, GLuint queryid)
^
main/performance_monitor.c:169:1: warning: unused function 'counterid_to_index' [-Wunused-function]
counterid_to_index(GLuint counterid)
^
3 warnings generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agomesa: remove unused clamp_float_to_uint() and clamp_half_to_uint()
Samuel Pitoiset [Tue, 11 Apr 2017 12:03:00 +0000 (14:03 +0200)]
mesa: remove unused clamp_float_to_uint() and clamp_half_to_uint()

Fixes the following Clang warnings.

main/pack.c:470:1: warning: unused function 'clamp_float_to_uint' [-Wunused-function]
clamp_float_to_uint(GLfloat f)
^
main/pack.c:477:1: warning: unused function 'clamp_half_to_uint' [-Wunused-function]
clamp_half_to_uint(GLhalfARB h)
^
2 warnings generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agomesa: remove unused _mesa_unmarshal_BindBufferBase()
Samuel Pitoiset [Tue, 11 Apr 2017 12:01:51 +0000 (14:01 +0200)]
mesa: remove unused _mesa_unmarshal_BindBufferBase()

Fixes the following Clang warning.

main/marshal.c:209:1: warning: unused function '_mesa_unmarshal_BindBufferBase' [-Wunused-function]
_mesa_unmarshal_BindBufferBase(struct gl_context *ctx, const struct marshal_cmd_BindBufferBase *cmd)
^
1 warning generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agovirgl: add missing PIPE_CAP_DOUBLES
Samuel Pitoiset [Tue, 11 Apr 2017 11:54:44 +0000 (13:54 +0200)]
virgl: add missing PIPE_CAP_DOUBLES

Fixes the following Clang warning.

virgl_screen.c:60:12: warning: enumeration value 'PIPE_CAP_DOUBLES' not handled in switch [-Wswitch]
   switch (param) {
           ^
1 warning generated.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agoglsl: simplify apply_image_qualifier_to_variable()
Samuel Pitoiset [Wed, 12 Apr 2017 12:36:32 +0000 (14:36 +0200)]
glsl: simplify apply_image_qualifier_to_variable()

This removes one level of indentation and will improve readability
for bindless images.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agoglsl: add validate_fragment_flat_interpolation_input()
Samuel Pitoiset [Wed, 12 Apr 2017 10:47:47 +0000 (12:47 +0200)]
glsl: add validate_fragment_flat_interpolation_input()

Requested by Timothy Arceri.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agonvc0: Enable ARB_shader_ballot on Kepler+
Boyan Ding [Mon, 10 Apr 2017 14:56:05 +0000 (22:56 +0800)]
nvc0: Enable ARB_shader_ballot on Kepler+

readInvocationARB() and readFirstInvocationARB() need the SHFL.IDX
instruction, which was introduced in Kepler.

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
7 years agonvc0/ir: Implement TGSI_OPCODE_BALLOT and TGSI_OPCODE_READ_*
Boyan Ding [Mon, 10 Apr 2017 14:56:04 +0000 (22:56 +0800)]
nvc0/ir: Implement TGSI_OPCODE_BALLOT and TGSI_OPCODE_READ_*

v2: Check if each channel is masked in TGSI_OPCODE_BALLOT (Ilia Mirkin)

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
7 years agonvc0/ir: Implement TGSI_SEMANTIC_SUBGROUP_*
Boyan Ding [Mon, 10 Apr 2017 14:56:03 +0000 (22:56 +0800)]
nvc0/ir: Implement TGSI_SEMANTIC_SUBGROUP_*

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
7 years agonvc0/ir: Add SV_LANEMASK_* system values.
Boyan Ding [Mon, 10 Apr 2017 14:56:02 +0000 (22:56 +0800)]
nvc0/ir: Add SV_LANEMASK_* system values.

v2: Add name strings in nv50_ir_print.cpp (Ilia Mirkin)

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
7 years agonvc0/ir: Allow 0/1 immediate value as source of OP_VOTE
Boyan Ding [Mon, 10 Apr 2017 14:56:01 +0000 (22:56 +0800)]
nvc0/ir: Allow 0/1 immediate value as source of OP_VOTE

Implementation of readFirstInvocationARB() on NVIDIA hardware needs a
ballotARB(true) to determine the first active thread. This is expressed
in gm107 asm as (supposing the output is $r0):
vote any $r0 0x1 0x1

To model the always true input, which corresponds to the second 0x1
above, we make OP_VOTE accept immediate value 0/1 and emit "0x1" and
"not 0x1" in the src field respectively.

v2: Make sure that asImm() is not NULL (Samuel Pitoiset)

v3: (Ilia Mirkin)
Make the handling more symmetric with predicate version in gm107
Use i->getSrc(s)

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
7 years agogk110/ir: Emit OP_SHFL
Boyan Ding [Mon, 10 Apr 2017 14:56:00 +0000 (22:56 +0800)]
gk110/ir: Emit OP_SHFL

v2: Make sure that asImm() is not NULL (Samuel Pitoiset)

v3: Check the range of immediate in OP_SHFL (Ilia Mirkin)

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
7 years agonvc0/ir: Emit OP_SHFL
Boyan Ding [Mon, 10 Apr 2017 14:55:59 +0000 (22:55 +0800)]
nvc0/ir: Emit OP_SHFL

v2: (Samuel Pitoiset)
Add an assertion to check if the target is Kepler
Make sure that asImm() is not NULL

v3: (Ilia Mirkin)
Check the range of immediate value of OP_SHFL
Use the new setPDSTL API

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
7 years agonvc0/ir: Properly handle a "split form" of predicate destination
Boyan Ding [Mon, 10 Apr 2017 14:55:58 +0000 (22:55 +0800)]
nvc0/ir: Properly handle a "split form" of predicate destination

GF100's ISA encoding has a weird form of predicate destination where its
3 bits are split across the whole instruction. Use a dedicated setPDSTL
function instead of the original defId, which is incorrect in this case.

v2: (Ilia Mirkin)
Change API of setPDSTL() to handle cases of no output
Fix setting of the highest bit in setPDSTL()

Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
7 years agogm107/ir: Emit third src 'bound' and optional predicate output of SHFL
Boyan Ding [Mon, 10 Apr 2017 14:55:57 +0000 (22:55 +0800)]
gm107/ir: Emit third src 'bound' and optional predicate output of SHFL

v2: Emit the original hard-coded 0x1c03 when OP_SHFL is used in gm107's
    lowering (Samuel Pitoiset)

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
7 years agoclover: Fix build against clang SVN >= r299965
Michel Dänzer [Wed, 12 Apr 2017 08:17:34 +0000 (17:17 +0900)]
clover: Fix build against clang SVN >= r299965

clang::LangAS::Offset is gone, the behaviour is as if it was 0.

v2: Introduce and use clover::llvm::compat::lang_as_offset (Francisco
    Jerez)

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
7 years agost/mesa: add some _mesa_is_winsys_fbo() assertions
Brian Paul [Tue, 11 Apr 2017 03:11:55 +0000 (21:11 -0600)]
st/mesa: add some _mesa_is_winsys_fbo() assertions

A few functions related to FBOs/renderbuffers should only be used with
window-system buffers, not user-created FBOs.  Assert for that.
Add additional comments.  No piglit regressions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agost/mesa: minor optimization in st_DrawBuffers()
Brian Paul [Tue, 11 Apr 2017 03:09:41 +0000 (21:09 -0600)]
st/mesa: minor optimization in st_DrawBuffers()

We only do on-demand renderbuffer allocation for window-system FBOs,
not user-created FBOs.  So put the loop inside a conditional.

Plus, add some comments.  No piglit regressions.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agomesa/st: only update samplers for stages that have changed
Timothy Arceri [Tue, 11 Apr 2017 01:05:22 +0000 (11:05 +1000)]
mesa/st: only update samplers for stages that have changed

Might help reduce CPU overhead for some apps that use SSO (separate shader objects).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
7 years agost/mesa: Fix missing-braces warning.
Vinson Lee [Wed, 22 Mar 2017 23:21:41 +0000 (16:21 -0700)]
st/mesa: Fix missing-braces warning.

  CXX      state_tracker/st_glsl_to_nir.lo
state_tracker/st_glsl_to_nir.cpp:250:57: warning: suggest braces around initialization of subobject [-Wmissing-braces]
      nir_lower_wpos_ytransform_options wpos_options = {0};
                                                        ^
                                                        {}

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
7 years agoradv: Disable primitive restart for non-indexed draws
Alex Smith [Wed, 12 Apr 2017 08:20:42 +0000 (09:20 +0100)]
radv: Disable primitive restart for non-indexed draws

According to the Vulkan spec, VkPipelineInputAssemblyStateCreateInfo's
primitiveRestartEnable flag should only apply to indexed draws, however
it was being enabled regardless of the type of draw. This could cause
problems for non-indexed draws with >=65535 vertices if the previous
indexed draw used 16-bit indices.

Fixes corruption of the credits text in Mad Max.

v2: Reset primitive restart state after executing a secondary command
    buffer.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
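The rule described above can be sketched as a small predicate: primitive restart is only armed when the pipeline asks for it and the draw is indexed. The struct and function names are hypothetical and do not correspond to radv's internal types.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of a draw call. */
struct draw_info {
   bool indexed;            /* true for indexed draws */
   uint32_t vertex_count;
};

/* primitiveRestartEnable only applies to indexed draws. For a
 * non-indexed draw the hardware state must be off, otherwise a stale
 * restart index (for example 0xFFFF left over from a draw with 16-bit
 * indices) could match a vertex number and cut the primitive. */
bool
effective_primitive_restart(bool pipeline_restart_enable,
                            const struct draw_info *draw)
{
   return pipeline_restart_enable && draw->indexed;
}
```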
7 years agoanv: Only define wsi_cbs when VK_USE_PLATFORM_WAYLAND_KHR defined
Matt Turner [Wed, 12 Apr 2017 18:00:39 +0000 (11:00 -0700)]
anv: Only define wsi_cbs when VK_USE_PLATFORM_WAYLAND_KHR defined

7 years agoRevert "r600g: get rid of dummy pixel shader"
Marek Olšák [Wed, 12 Apr 2017 15:45:30 +0000 (17:45 +0200)]
Revert "r600g: get rid of dummy pixel shader"

This reverts commit 61e47d92c5196bf0240e322bb1b9d305836559e3.

It causes a hang on RS780.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100663

7 years agomesa: fix memory leak in arb_fragment_program
Bartosz Tomczyk [Sun, 9 Apr 2017 16:37:13 +0000 (18:37 +0200)]
mesa: fix memory leak in arb_fragment_program

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agoradv: Hash the immutable samplers.
Bas Nieuwenhuizen [Tue, 11 Apr 2017 22:40:36 +0000 (00:40 +0200)]
radv: Hash the immutable samplers.

Since the shader code can include them.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years agoradv: Use an offset instead of pointers for immutable samplers.
Bas Nieuwenhuizen [Tue, 11 Apr 2017 22:37:06 +0000 (00:37 +0200)]
radv: Use an offset instead of pointers for immutable samplers.

Makes more sense when we hash the layout for the pipeline cache.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years agoradv: Stop shadowing the result in radv_GetQueryPoolResults.
Bas Nieuwenhuizen [Tue, 11 Apr 2017 22:45:51 +0000 (00:45 +0200)]
radv: Stop shadowing the result in radv_GetQueryPoolResults.

The outer result was referred to, which meant bugs.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years agoradv: Return VK_NOT_READY if the query results are not available.
Bas Nieuwenhuizen [Tue, 11 Apr 2017 21:56:42 +0000 (23:56 +0200)]
radv: Return VK_NOT_READY if the query results are not available.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: 8475a14302e ("radv: Implement pipeline statistics queries.")
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
7 years agoradv: Set query availability bit even if we don't wait.
Bas Nieuwenhuizen [Tue, 11 Apr 2017 21:54:58 +0000 (23:54 +0200)]
radv: Set query availability bit even if we don't wait.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: 8475a14302e ("radv: Implement pipeline statistics queries.")
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
7 years agomesa: avoid NULL ptr in prog parameter name
Gregory Hainaut [Tue, 11 Apr 2017 20:29:17 +0000 (22:29 +0200)]
mesa: avoid NULL ptr in prog parameter name

Context: _mesa_add_parameter is sometimes[0] called with a
NULL name to denote an unnamed parameter.

Allowing a NULL pointer as a name means that it must be NULL-checked
on each access. So far that isn't always[1] the case.

The parameter name is only used for debugging (printf) and
to look up the index/location of the program by the application.

In conclusion, there is no valid reason to use a NULL pointer instead of
an empty string. So it was decided to use an empty string, which avoids all
issues related to NULL pointers.

[0]: texture gather offsets glsl opcode and st_init_atifs_prog
[1]: at least shader cache, st_nir_lookup_parameter_index and some printfs

Issue found by piglit 'texturegatheroffsets' tests on Nouveau

v4: new patch based on Nicolai/Timothy/Ilia discussion
Signed-off-by: Gregory Hainaut <gregory.hainaut@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
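The approach described above can be illustrated with a short sketch that substitutes an empty string for a NULL name at the point where the name is stored, so later readers never need a NULL check. The helper name dup_param_name() is hypothetical; the real change lives in _mesa_add_parameter and may be structured differently.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate a parameter name, mapping NULL to "".
 * Returns a heap-allocated string that the caller must free, or NULL
 * only if the allocation itself fails. */
char *
dup_param_name(const char *name)
{
   const char *src = name ? name : "";
   size_t len = strlen(src) + 1;   /* include the terminating NUL */
   char *copy = malloc(len);

   if (copy)
      memcpy(copy, src, len);

   return copy;
}
```

With this in place, code such as printf("%s", param_name) or strcmp(param_name, wanted) is safe for unnamed parameters as well.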
7 years agoi965/drm: Use bools for a few flags.
Kenneth Graunke [Tue, 11 Apr 2017 07:04:29 +0000 (00:04 -0700)]
i965/drm: Use bools for a few flags.

These one bit values are booleans.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agoi965/drm: Make brw_bo_alloc_tiled flags parameter 32-bit.
Kenneth Graunke [Tue, 11 Apr 2017 07:02:35 +0000 (00:02 -0700)]
i965/drm: Make brw_bo_alloc_tiled flags parameter 32-bit.

unsigned long is a terrible type for a bitfield - if you need fewer
than 32 bits, it wastes 4 bytes.  If you need more, things break on
32-bit builds.  Just use unsigned.

Even that's a bit ridiculous as we only have one flag today.
Still, it's at least somewhat better.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agoi965/drm: Make BO size a uint64_t rather than unsigned long.
Kenneth Graunke [Tue, 11 Apr 2017 06:10:04 +0000 (23:10 -0700)]
i965/drm: Make BO size a uint64_t rather than unsigned long.

The drm_i915_gem_create ioctl structure uses a __u64 for the size,
so we should probably use uint64_t to match.  In theory, we could
probably have a BO larger than 4GB, using a 48-bit PPGTT - it just
wouldn't be mappable in the CPU's 32-bit address space.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
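To illustrate why the type matters: on a build where unsigned long is 32 bits wide, a size above 4GB does not survive being stored in it. The sketch below models that narrowing explicitly with uint32_t so that it behaves the same on every platform. The function name is made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a buffer size survives a round trip through a 32-bit
 * integer. uint32_t stands in for unsigned long on a typical 32-bit
 * build, which is an assumption of this sketch. */
bool
fits_in_32_bits(uint64_t size)
{
   return (uint64_t)(uint32_t)size == size;
}
```

A 5GB size, written as (uint64_t)5 << 30, fails this check, which is the case a uint64_t size field handles and a 32-bit unsigned long does not.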
7 years agoi965/drm: Make alignment parameter a uint64_t.
Kenneth Graunke [Tue, 11 Apr 2017 06:55:21 +0000 (23:55 -0700)]
i965/drm: Make alignment parameter a uint64_t.

Theoretically, with a 48-bit address space, we could have buffers
with an alignment of >= 4GB.  It's a bit silly, but the exec_object
structs (drm_i915_gem_exec_object2) use a __u64 for this, so we may
as well use the same type as the kernel API.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agoi965/drm: Make stride/pitch a uint32_t.
Kenneth Graunke [Tue, 11 Apr 2017 06:08:23 +0000 (23:08 -0700)]
i965/drm: Make stride/pitch a uint32_t.

struct drm_i915_gem_set_tiling's stride field is a __u32.
intel_mipmap_tree::stride is a uint32_t.  Using unsigned long just
doesn't make sense.  Switching also lets us drop many pointless
locals that only existed to deal with the type mismatch.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agoi965/drm: Fix types for pwrite/pread fields.
Kenneth Graunke [Tue, 11 Apr 2017 06:00:24 +0000 (23:00 -0700)]
i965/drm: Fix types for pwrite/pread fields.

The ioctl structs contain __u64 offset and size fields, so make them
uint64_t rather than unsigned long.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agoi965/drm: Make brw_bo_alloc_tiled take tiling by value, not pointer.
Kenneth Graunke [Tue, 11 Apr 2017 06:31:20 +0000 (23:31 -0700)]
i965/drm: Make brw_bo_alloc_tiled take tiling by value, not pointer.

For some reason we passed tiling by pointer, through several layers,
even though the functions only read the initial value, and never
actually change it.  We even had a do-while loop that executed until
the tiling mode matched - except it always did, so it only ran once.
We then had bogus error handling in case it changed the tiling mode
to something nonsensical...which it never did.

Drop all this nonsense.

Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
7 years agomesa/st: remove _mesa_get_fallback_texture() calls
Timothy Arceri [Tue, 11 Apr 2017 04:30:15 +0000 (14:30 +1000)]
mesa/st: remove _mesa_get_fallback_texture() calls

These calls look like leftovers from fallback texture support first
being added to the st in 8f6d9e12be0be and then later being added
to core mesa in 00e203fe17cbf21.

The piglit test fp-incomplete-tex continues to work with this
change.

Reviewed-by: Brian Paul <brianp@vmware.com>
7 years agomesa: use pre_hashed version of search for the mesa hash table
Timothy Arceri [Mon, 10 Apr 2017 12:21:37 +0000 (22:21 +1000)]
mesa: use pre_hashed version of search for the mesa hash table

The key is just an unsigned int so there is never any real hashing
done.

Reviewed-by: Eric Anholt <eric@anholt.net>
7 years agoswr: [rasterizer core] Disable 8x2 tile backend
Tim Rowley [Fri, 7 Apr 2017 21:51:42 +0000 (16:51 -0500)]
swr: [rasterizer core] Disable 8x2 tile backend

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agoswr: [rasterizer common] Add _simd_testz_si alias
Tim Rowley [Fri, 7 Apr 2017 21:31:36 +0000 (16:31 -0500)]
swr: [rasterizer common] Add _simd_testz_si alias

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agoswr: [rasterizer archrast] Fix archrast for MSVC 2017 compiler
Tim Rowley [Fri, 7 Apr 2017 16:41:25 +0000 (11:41 -0500)]
swr: [rasterizer archrast] Fix archrast for MSVC 2017 compiler

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agoswr: [rasterizer jitter] Remove unused function
Tim Rowley [Fri, 7 Apr 2017 15:58:38 +0000 (10:58 -0500)]
swr: [rasterizer jitter] Remove unused function

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agoswr: [rasterizer jitter] Remove HAVE_LLVM tests supporting llvm < 3.8
Tim Rowley [Fri, 7 Apr 2017 12:57:11 +0000 (07:57 -0500)]
swr: [rasterizer jitter] Remove HAVE_LLVM tests supporting llvm < 3.8

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agoswr: [rasterizer common/core] Fix 32-bit windows build
Tim Rowley [Fri, 7 Apr 2017 09:37:25 +0000 (04:37 -0500)]
swr: [rasterizer common/core] Fix 32-bit windows build

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agoswr: [rasterizer core] Fix unused variable warnings
Tim Rowley [Fri, 7 Apr 2017 03:11:45 +0000 (22:11 -0500)]
swr: [rasterizer core] Fix unused variable warnings

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agoswr: [rasterizer core] Code formating change
Tim Rowley [Thu, 6 Apr 2017 23:21:54 +0000 (18:21 -0500)]
swr: [rasterizer core] Code formating change

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agoswr: [rasterizer core] SIMD16 Frontend WIP - PA
Tim Rowley [Thu, 6 Apr 2017 21:37:03 +0000 (16:37 -0500)]
swr: [rasterizer core] SIMD16 Frontend WIP - PA

Fix PA NextPrim for SIMD8 on SIMD16.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agoswr: [rasterizer core] SIMD16 Frontend WIP - Clipper
Tim Rowley [Thu, 6 Apr 2017 20:22:55 +0000 (15:22 -0500)]
swr: [rasterizer core] SIMD16 Frontend WIP - Clipper

Implement widened clipper for SIMD16.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agoswr: [rasterizer core] Multisample sample position setup change
Tim Rowley [Sat, 1 Apr 2017 01:33:43 +0000 (20:33 -0500)]
swr: [rasterizer core] Multisample sample position setup change

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agoswr: [rasterizer core] Reduce templates to speed compile
Tim Rowley [Fri, 31 Mar 2017 21:50:40 +0000 (16:50 -0500)]
swr: [rasterizer core] Reduce templates to speed compile

Quick patch to remove some unused template params to cut down
rasterizer compile time.

Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
7 years agoi965/fs: Take into account lower frequency of conditional blocks in spilling cost...
Francisco Jerez [Mon, 10 Apr 2017 00:28:58 +0000 (17:28 -0700)]
i965/fs: Take into account lower frequency of conditional blocks in spilling cost heuristic.

The individual branches of an if/else/endif construct will be executed
some unknown number of times between 0 and 1 relative to the parent
block.  Use some factor in between as weight while approximating the
cost of spill/fill instructions within a conditional if-else branch.
This favors spilling registers used within conditional branches which
are likely to be executed less frequently than registers used at the
top level.

Improves the framerate of the SynMark2 OglCSDof benchmark by ~1.9x on
my SKL GT4e.  Should have a comparable effect on other platforms.  No
significant regressions.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
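
The weighting described above can be sketched roughly as follows. This is not the i965 implementation: the struct inst record, the function name, and the factors 10 (per loop level) and 0.5 (per conditional level) are assumptions chosen only to illustrate the idea that uses inside an if/else branch should count for less than uses at the top level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified instruction record: which virtual register the
 * instruction touches and how deeply it is nested. */
struct inst {
    int vgrf;        /* register read or written by this instruction */
    int if_depth;    /* number of enclosing if/else blocks */
    int loop_depth;  /* number of enclosing loops */
};

/* Accumulate an estimated spill cost per register.  Each use is weighted by
 * an assumed factor of 10 per enclosing loop and 0.5 per enclosing
 * conditional, so uses in rarely executed branches add less cost. */
static void
estimate_spill_costs(const struct inst *insts, size_t count,
                     float *cost, int num_vgrfs)
{
    for (size_t i = 0; i < count; i++) {
        float weight = 1.0f;

        assert(insts[i].vgrf >= 0 && insts[i].vgrf < num_vgrfs);

        for (int d = 0; d < insts[i].loop_depth; d++)
            weight *= 10.0f;
        for (int d = 0; d < insts[i].if_depth; d++)
            weight *= 0.5f;

        cost[insts[i].vgrf] += weight;
    }
}
```

A register whose only use sits inside one conditional ends up with half the cost of a register used once at the top level, which makes it the cheaper spill candidate.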
7 years agoswr: return true for PIPE_CAP_DOUBLES
Tim Rowley [Tue, 11 Apr 2017 16:50:23 +0000 (11:50 -0500)]
swr: return true for PIPE_CAP_DOUBLES

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
7 years agoi965: Set kernel features before computing max GL version.
Kenneth Graunke [Tue, 11 Apr 2017 15:33:20 +0000 (08:33 -0700)]
i965: Set kernel features before computing max GL version.

We check these bitfields when computing the Haswell max GL version.
We need to set them ahead of time, or they won't exist, and all our
checks will fail.  That sets the max core profile GL version to 4.2.

This introduces the bizarre situation where asking for a GL context
with version 4.3+ fails, but asking for a GL core profile context
with version <= 4.2 actually promotes you to a 4.5 context.

GLX_MESA_query_renderer also reported the bogus 4.2 value.
Now it shows 4.5.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Reported-and-tested-by: Rafael Ristovski <rafael.ristovski@gmail.com>
7 years agoanv: remove needless VALGRIND_MAKE_MEM_DEFINED
Juan A. Suarez Romero [Tue, 11 Apr 2017 11:15:31 +0000 (13:15 +0200)]
anv: remove needless VALGRIND_MAKE_MEM_DEFINED

This is already invoked in the following VG_NOACCESS_READ() call.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
7 years agoetnaviv: enable TS, but disable autodisable
Lucas Stach [Mon, 21 Nov 2016 11:32:15 +0000 (12:32 +0100)]
etnaviv: enable TS, but disable autodisable

Autodisable seems to cause missed rendering in some cases, but
otherwise TS seems to work properly.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
7 years agoetnaviv: enable TS also on sampler resources
Lucas Stach [Mon, 21 Nov 2016 11:29:04 +0000 (12:29 +0100)]
etnaviv: enable TS also on sampler resources

Fixes a performance issue with imported winsys buffers as those are
marked with binding sampler view.

This might require a TS flush on single pipe chips that directly
sample from the rendered buffer, but otherwise seems to work fine.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
7 years agoetnaviv: align TS surface size to number of pixel pipes
Lucas Stach [Mon, 21 Nov 2016 11:27:47 +0000 (12:27 +0100)]
etnaviv: align TS surface size to number of pixel pipes

The TS surface gets cleared by a tiled RS fill. If the chip has
more than 1 pixel pipe the size of the TS surface needs to be
aligned so that each pipe address matches a tile start, otherwise
the RS will hang.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
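
As a rough illustration of the alignment rule above, the TS size can be rounded up to a multiple of the per-pipe tile size times the number of pixel pipes. The function name and the tile_bytes parameter are hypothetical; the actual granularity used by etnaviv is not stated in this log.

```c
#include <assert.h>
#include <stdint.h>

/* Round ts_size up so that it divides evenly across all pixel pipes in
 * whole tiles.  tile_bytes is an assumed per-pipe fill granularity. */
static uint32_t
align_ts_size(uint32_t ts_size, uint32_t tile_bytes, uint32_t pixel_pipes)
{
    uint32_t granule;

    assert(tile_bytes > 0 && pixel_pipes > 0);

    granule = tile_bytes * pixel_pipes;
    return (ts_size + granule - 1) / granule * granule;
}
```

With a single pipe this degenerates to plain tile alignment, so single-pipe chips are unaffected.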
7 years agoetnaviv: avoid using invalid TS
Lucas Stach [Mon, 21 Nov 2016 11:25:29 +0000 (12:25 +0100)]
etnaviv: avoid using invalid TS

The TS is only valid after it has been initialized by a fast
clear, so it should not be taken into account when blitting
resources that haven't been cleared. Also the blit itself
invalidates the destination TS, as it's not updated and will
retain data from the previous rendering after the blit.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
7 years agoglsl: use the BA1 macro for textureQueryLevels()
Samuel Pitoiset [Mon, 10 Apr 2017 17:23:17 +0000 (19:23 +0200)]
glsl: use the BA1 macro for textureQueryLevels()

For both consistency and new bindless sampler types.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agoglsl: use the BA1 macro for textureSamples()
Samuel Pitoiset [Mon, 10 Apr 2017 17:23:16 +0000 (19:23 +0200)]
glsl: use the BA1 macro for textureSamples()

For both consistency and new bindless sampler types.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agoglsl: use the BA1 macro for textureCubeArrayShadow()
Samuel Pitoiset [Mon, 10 Apr 2017 17:23:15 +0000 (19:23 +0200)]
glsl: use the BA1 macro for textureCubeArrayShadow()

For both consistency and new bindless sampler types.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
7 years agoradv: Implement pipeline statistics queries.
Bas Nieuwenhuizen [Mon, 10 Apr 2017 20:20:19 +0000 (22:20 +0200)]
radv: Implement pipeline statistics queries.

The devil is in the shader again, otherwise this is
fairly straightforward.

The CTS contains no pipeline statistics copy to buffer
testcases, so I did a basic smoketest.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years agoradv: Let count be dynamic in radv_break_on_count.
Bas Nieuwenhuizen [Mon, 10 Apr 2017 21:54:51 +0000 (23:54 +0200)]
radv: Let count be dynamic in radv_break_on_count.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years agoradv: Rename query pipeline/set layout.
Bas Nieuwenhuizen [Mon, 10 Apr 2017 19:49:48 +0000 (21:49 +0200)]
radv: Rename query pipeline/set layout.

For using them with both occlusion and pipeline statistics queries.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years agoradv: Use VK_WHOLE_SIZE for the query buffer bindings.
Bas Nieuwenhuizen [Mon, 10 Apr 2017 19:46:07 +0000 (21:46 +0200)]
radv: Use VK_WHOLE_SIZE for the query buffer bindings.

The buffer sizes are specified just a few lines earlier, so don't
repeat ourselves.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years agoradv: Use a shader for occlusion CmdCopyQueryPoolResults.
Bas Nieuwenhuizen [Sun, 9 Apr 2017 20:35:32 +0000 (22:35 +0200)]
radv: Use a shader for occlusion CmdCopyQueryPoolResults.

Use the new occlusion query copy shader.

We don't use the shader for the waiting as a polling loop interacts badly
with having caching enabled. I noticed on my GPU (Tonga) that the values
are written out in order, so I just use a WAIT_REG_MEM on the last value.

If it turns out other chips don't do that we may need to look a bit more
into this. Having 8 WAIT_REG_MEM packets per query doesn't sound ideal.

This also restricts the availability word in the pool to timestamp queries
only, as occlusion queries don't use it, and pipeline statistic queries
likely won't either.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
7 years agoradv: Add occlusion query shader.
Bas Nieuwenhuizen [Sun, 26 Feb 2017 17:21:01 +0000 (18:21 +0100)]
radv: Add occlusion query shader.

Adds a shader for writing occlusion query results to a buffer, as the
CP packet isn't supported on SI or secondary buffers, and doesn't handle
the availability bit (or partial results) nor truncation to 32-bit.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
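
The work such a shader has to do can be sketched on the CPU side. The layout below is an assumption made for illustration (one begin/end pair of 64-bit counters per render backend, with the top bit set once a value has been written); the names are hypothetical and this is not the radv shader itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed marker: top bit set means the counter has been written. */
#define VALID_BIT (UINT64_C(1) << 63)

struct zpass_pair {
    uint64_t begin;
    uint64_t end;
};

/* Sum (end - begin) over all pairs that have been fully written.
 * Returns true if every pair was available; otherwise *result holds a
 * partial sum and the function returns false. */
static bool
resolve_occlusion(const struct zpass_pair *pairs, unsigned count,
                  uint64_t *result)
{
    bool available = true;
    uint64_t sum = 0;

    assert(pairs && result);

    for (unsigned i = 0; i < count; i++) {
        if ((pairs[i].begin & VALID_BIT) && (pairs[i].end & VALID_BIT))
            sum += (pairs[i].end & ~VALID_BIT) - (pairs[i].begin & ~VALID_BIT);
        else
            available = false;
    }

    *result = sum;
    return available;
}

/* Truncate a 64-bit result for callers that asked for 32-bit values. */
static uint32_t
truncate_result(uint64_t value)
{
    return (uint32_t)value;
}
```

The returned flag corresponds to the availability bit, the partial sum to partial results, and truncate_result to the 32-bit case mentioned in the commit message.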
7 years agoi965: Fix wonky indentation left by brw_bo_alloc_tiled rename.
Kenneth Graunke [Tue, 11 Apr 2017 06:23:13 +0000 (23:23 -0700)]
i965: Fix wonky indentation left by brw_bo_alloc_tiled rename.

7 years agonouveau: when mapping a persistent buffer, synchronize on former xfers
Ilia Mirkin [Sat, 8 Apr 2017 22:31:35 +0000 (18:31 -0400)]
nouveau: when mapping a persistent buffer, synchronize on former xfers

If the buffer is being used, we should wait for those uses to be
complete before returning the map.

Fixes: GL45-CTS.direct_state_access.buffers_functional
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
7 years agonvc0: increase texture buffer object alignment to 256 for pre-GM107
Ilia Mirkin [Sat, 8 Apr 2017 18:56:16 +0000 (14:56 -0400)]
nvc0: increase texture buffer object alignment to 256 for pre-GM107

We currently don't pass the low byte of the address via the surface
info, so in order to work with images, these have to implicitly be
aligned to 256. The proprietary driver also doesn't go out of its way to
provide lower alignment.

Fixes GL45-CTS.texture_buffer.texture_buffer_texture_buffer_range

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
7 years agomesa: fix typo and add assert() to _mesa_attach_renderbuffer_without_ref()
Timothy Arceri [Mon, 10 Apr 2017 23:57:45 +0000 (09:57 +1000)]
mesa: fix typo and add assert() to _mesa_attach_renderbuffer_without_ref()

This function should only be used with a "freshly created" renderbuffer
so assert RefCount is 1.
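
A minimal sketch of that precondition, using simplified stand-in structs rather than the real Mesa types:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the real Mesa structures. */
struct renderbuffer { int RefCount; };
struct attachment { struct renderbuffer *Renderbuffer; };

/* Take over the caller's single reference instead of adding a new one.
 * This is only safe for a freshly created renderbuffer, hence the assert. */
static void
attach_without_ref(struct attachment *att, struct renderbuffer *rb)
{
    assert(rb->RefCount == 1);
    att->Renderbuffer = rb;
}
```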

7 years agoi965/drm: Add stall warnings when mapping or waiting on BOs.
Kenneth Graunke [Mon, 10 Apr 2017 06:14:56 +0000 (23:14 -0700)]
i965/drm: Add stall warnings when mapping or waiting on BOs.

This restores the performance warnings removed in:

    i965: Drop brw_bo_map[_gtt] wrappers which issue perf warnings.

but adds them for nearly all BO mapping, and also for wait_rendering.

Because we add this to the core bufmgr, we automatically get stall
warnings in all callers, unlike before where only a few callsites used
the wrappers that gave stall warnings.

We also do it a bit differently: we simply measure how long set_domain
takes (the part that stalls), and complain if it's more than 0.01 ms.
We don't bother calling brw_bo_busy(), and we don't measure the mmap
time (which doesn't stall).  This should be more accurate.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
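
The timing approach described above can be sketched as below. Only the 0.01 ms threshold comes from the commit message; timed_wait, example_wait, and the callback shape are hypothetical stand-ins for the set_domain call, and the clock choice is an assumption.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdio.h>
#include <time.h>

#define STALL_THRESHOLD_MS 0.01

/* Current monotonic time in milliseconds. */
static double
now_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1.0e6;
}

/* Run the potentially stalling operation, measure only that call, and
 * warn on stderr if it took longer than the threshold.  Returns the
 * elapsed time in milliseconds. */
static double
timed_wait(void (*wait_fn)(void *), void *data, const char *what)
{
    double start, elapsed;

    assert(wait_fn);

    start = now_ms();
    wait_fn(data);
    elapsed = now_ms() - start;

    if (elapsed > STALL_THRESHOLD_MS)
        fprintf(stderr, "%s stalled for %.3f ms\n", what, elapsed);

    return elapsed;
}

/* Example operation: sleep for the number of nanoseconds in *data. */
static void
example_wait(void *data)
{
    struct timespec req = { 0, *(long *)data };
    nanosleep(&req, NULL);
}
```

Because the timer wraps only the blocking call, work that does not stall (such as the mmap itself) does not inflate the measurement.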
7 years agoi965/drm: Make a set_domain() helper function.
Kenneth Graunke [Mon, 10 Apr 2017 05:58:57 +0000 (22:58 -0700)]
i965/drm: Make a set_domain() helper function.

Less boilerplate.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
7 years agoi965/batch: Ensure we use a consistent offset in relocs
Daniel Vetter [Thu, 6 Apr 2017 07:13:47 +0000 (09:13 +0200)]
i965/batch: Ensure we use a consistent offset in relocs

In theory gcc is free to re-load them, and if a concurrent
execbuf races and updates bo->offset64 then we have a problem:
execbuffer api requires that the ->presumed_offset and the one
we used for the reloc match. It does not require that the value
is sensible, which means no locks needed, just a consistent load.

Ken said his next series will nuke this, so just hand-roll the
kernel's READ_ONCE idea inline.

FIXME: Most callers of brw_emit_reloc recompute the relocation
themselves, which means this doesn't really fix the race. But the long
term plan is to move to per-context relocation handling, which will
fix this all properly. So leave this for now as just a reminder.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
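
A hand-rolled READ_ONCE along these lines might look like the sketch below. The struct layouts and emit_reloc are simplified, hypothetical stand-ins; the point is only that the offset is loaded once through a volatile access and both values are derived from that single load.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the buffer object and relocation entry. */
struct bo { uint64_t offset64; };
struct reloc {
    uint64_t presumed_offset;  /* what we tell the kernel we assumed */
    uint64_t written_value;    /* what we wrote into the batch */
};

static struct reloc
emit_reloc(const struct bo *target, uint32_t delta)
{
    /* The volatile access keeps the compiler from re-loading offset64
     * between the two uses below.  It does not make the load atomic;
     * on some 32-bit targets a 64-bit load could still tear. */
    uint64_t offset = *(const volatile uint64_t *)&target->offset64;
    struct reloc r;

    r.presumed_offset = offset;
    r.written_value = offset + delta;
    return r;
}
```

Even if another thread updates offset64 concurrently, presumed_offset and written_value stay consistent with each other, which is the property the execbuffer interface relies on.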
7 years agoi965/bufmgr: Garbage-collect vma cache/pruning
Daniel Vetter [Thu, 6 Apr 2017 06:48:08 +0000 (08:48 +0200)]
i965/bufmgr: Garbage-collect vma cache/pruning

This was done because the kernel has 1 global address space, shared
with all render clients, for gtt mmap offsets, and that address space
was only 32bit on 32bit kernels.

This was fixed in

commit 440fd5283a87345cdd4237bdf45fb01130ea0056
Author: Thierry Reding <treding@nvidia.com>
Date:   Fri Jan 23 09:05:06 2015 +0100

    drm/mm: Support 4 GiB and larger ranges

which shipped in 4.0. Of course you still want to limit the bo cache
to a reasonable size on 32bit apps to avoid ENOMEM, but that's better
solved by tuning the cache a bit. On 64bit, this was never an issue.

On top, mesa never set this, so it's all dead code. Collect and trash it.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
7 years agoi965/bufmgr: Remove some reuse functions
Daniel Vetter [Thu, 6 Apr 2017 06:28:51 +0000 (08:28 +0200)]
i965/bufmgr: Remove some reuse functions

is_reusable was needed by uxa because it couldn't keep track of its
scanout buffers and used this as a proxy. Disabling reuse is a silly
idea, we set this once at start. Remove both.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
7 years agoi965/bufmgr: remove start_gtt_access
Daniel Vetter [Thu, 6 Apr 2017 06:27:47 +0000 (08:27 +0200)]
i965/bufmgr: remove start_gtt_access

Iirc this was used by uxa for persistent mmaps of the frontbuffer. For
mesa all the set_domain stuff needed before a synchronized mmap is handled
within the bufmgr, so no reason ever to call this.

Inline the implementation into its only internal user.

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>