Ilia Mirkin [Wed, 9 Jul 2014 04:41:11 +0000 (00:41 -0400)]
nvc0/ir: add kepler+ support for indirect texture references
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Ilia Mirkin [Wed, 6 Aug 2014 05:22:49 +0000 (01:22 -0400)]
nvc0/ir: add base tex offset for fermi indirect tex case
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Kenneth Graunke [Mon, 11 Aug 2014 22:05:54 +0000 (15:05 -0700)]
i965: Revert part of f5cc3fdcf1680b116612fac7c39f1bd79f5e555e.
Fixes non-termination in various Piglit tests.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Eric Anholt [Sat, 9 Aug 2014 18:01:53 +0000 (11:01 -0700)]
vc4: Flip which primitives are considered front-facing.
This mostly fixes glxgears rendering.
Eric Anholt [Sat, 9 Aug 2014 18:00:51 +0000 (11:00 -0700)]
vc4: Don't forget to set the depth clear value in the packet.
This gets glxgears partially rendering again.
Eric Anholt [Tue, 5 Aug 2014 21:24:29 +0000 (14:24 -0700)]
vc4: Add support for gl_FragCoord.
This isn't passing all tests (glsl-fs-fragcoord-zw-ortho, for example),
but it does get a bunch more tests passing.
v2: Rebase on helpers change.
Eric Anholt [Tue, 5 Aug 2014 21:23:40 +0000 (14:23 -0700)]
vc4: Refactor shader input setup again.
This makes some space for handling special inputs like fragcoords.
Eric Anholt [Tue, 5 Aug 2014 18:00:51 +0000 (11:00 -0700)]
vc4: Clean up the tile alloc buffer size.
This prevents some simulator assertion failures, but it does mean (since
I've dropped the "* 16" padding) that on real hardware you need a kernel
that does overflow memory management (currently, "drm/vc4: Add support for
binner overflow memory allocation." in my kernel tree).
Eric Anholt [Tue, 5 Aug 2014 18:00:08 +0000 (11:00 -0700)]
vc4: Clarify some values implicitly chosen for binning config.
These #defines are 0, but they should help the math above make more sense.
Eric Anholt [Tue, 5 Aug 2014 17:54:56 +0000 (10:54 -0700)]
vc4: Improve simulator memory allocation.
This should reduce a bunch of spurious failures in sim.
Eric Anholt [Tue, 5 Aug 2014 01:30:33 +0000 (18:30 -0700)]
vc4: Handle stride==0 in VBO validation
Eric Anholt [Mon, 4 Aug 2014 23:38:07 +0000 (16:38 -0700)]
vc4: Stash some debug code for looking at what BOs are at what hindex.
When you're debugging validation, it's nice to know what the BOs are for.
Eric Anholt [Mon, 4 Aug 2014 20:01:29 +0000 (13:01 -0700)]
vc4: Use GEM under simulation even for non-winsys BOs.
In addition to reducing sim-specific code, it also avoids our local handle
allocation conflicting with the host GEM's handle numbering, which was
causing vc4_gem_hindex() to not distinguish between winsys BOs and the
same-numbered non-winsys bo.
Eric Anholt [Mon, 4 Aug 2014 20:00:56 +0000 (13:00 -0700)]
vc4: Don't forget to unmap the GEM BO when freeing.
Otherwise it'll stick around forever.
Eric Anholt [Sun, 3 Aug 2014 04:28:34 +0000 (21:28 -0700)]
vc4: Add validation of raster-format textures.
... and reject everything else, for now.
v2: Rebase on v2 of the rendering config validation change.
Eric Anholt [Sun, 3 Aug 2014 04:23:20 +0000 (21:23 -0700)]
vc4: Drop VC4_PACKET_PRIMITIVE_LIST_FORMAT.
It's not relevant to our command streams any more.
v2: Fix indentation and a typo in the comment.
Eric Anholt [Sun, 3 Aug 2014 04:06:50 +0000 (21:06 -0700)]
vc4: Add validation that vertex indices don't overflow VBO bounds.
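A bounds check of this kind could be sketched as below. This is only an illustration: vbo_fits() and its parameters are hypothetical names, not taken from the vc4 validator, and the stride == 0 branch reflects the separate "Handle stride==0 in VBO validation" change listed above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not the vc4 validator's code): returns true if every
 * vertex index up to max_index can be fetched from a buffer of bo_size bytes
 * without reading past its end. */
static bool
vbo_fits(uint32_t bo_size, uint32_t offset, uint32_t stride,
         uint32_t attr_size, uint32_t max_index)
{
   /* The first attribute must fit at the starting offset. */
   if (offset > bo_size || attr_size > bo_size - offset)
      return false;

   /* With stride == 0 every index reads the same bytes. */
   if (stride == 0)
      return true;

   /* Largest index whose attribute still ends inside the buffer.
    * Dividing avoids overflow in max_index * stride. */
   uint32_t last_ok = (bo_size - offset - attr_size) / stride;
   return max_index <= last_ok;
}
```

For a 64-byte buffer holding 16-byte attributes at a 16-byte stride, indices 0 through 3 pass and index 4 is rejected.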
Eric Anholt [Sun, 3 Aug 2014 03:44:39 +0000 (20:44 -0700)]
vc4: Fix the shader record size for extended strides.
It turns out they aren't packed when attributes are missing, according to
both docs and simulation.
v2: Drop unused variable.
Eric Anholt [Sun, 3 Aug 2014 03:30:18 +0000 (20:30 -0700)]
vc4: Add a bunch of validation of render mode configuration.
v2: Fix a build break after some previous rebase.
Eric Anholt [Sun, 3 Aug 2014 03:19:38 +0000 (20:19 -0700)]
vc4: Store the (currently always linear) tiling format in the resource.
Eric Anholt [Sat, 2 Aug 2014 00:11:38 +0000 (17:11 -0700)]
vc4: Add a bunch of validation of the binning mode config.
Eric Anholt [Sat, 2 Aug 2014 03:23:31 +0000 (20:23 -0700)]
vc4: Validate that the same BO doesn't get reused for different purposes.
We don't care if things like vertex data get smashed by render target
data, but we do need to make sure that shader code doesn't get rendered
to.
v2: Fix overflowing read of gl_relocs[] that incorrectly flagged some
VBOs as shader code.
Eric Anholt [Sat, 2 Aug 2014 00:31:40 +0000 (17:31 -0700)]
vc4: Use the packet #defines in the kernel validation code.
Eric Anholt [Sat, 2 Aug 2014 00:17:03 +0000 (17:17 -0700)]
vc4: Rename GEM_HANDLES to be in a namespace.
It's not a real VC4 hardware packet, but I've put in a comment to explain
it.
Eric Anholt [Sat, 2 Aug 2014 00:05:21 +0000 (17:05 -0700)]
vc4: Clean up TMU write validation.
The comment conflicted with the support in the code, so I moved the TMU
write validation to where the comment was, and dropped some dead arguments
from the functions while changing their signatures.
Eric Anholt [Sat, 2 Aug 2014 00:01:44 +0000 (17:01 -0700)]
vc4: Update a comment about shader validation
Eric Anholt [Fri, 1 Aug 2014 23:02:37 +0000 (16:02 -0700)]
vc4: Add proper translation from Zc to Zs for vertex output.
This fixes the remaining failure in depthfunc.
Eric Anholt [Fri, 1 Aug 2014 20:32:49 +0000 (13:32 -0700)]
vc4: Add support for depth clears and tests within a tile.
This doesn't load/store the Z contents across submits yet. It also
disables early Z, since it's going to require tracking of Z functions
across multiple state updates to track the early Z direction and whether
it can be used.
v2: Move the key setup to before the search for the key.
Eric Anholt [Fri, 1 Aug 2014 22:33:06 +0000 (15:33 -0700)]
vc4: Avoid flushing when mapping buffers that aren't in the batch.
This should prevent a bunch of unnecessary flushes for things like
updating immediate vertex data.
Eric Anholt [Thu, 31 Jul 2014 05:17:56 +0000 (22:17 -0700)]
vc4: Drop the flush at the end of the draw
Now we actually get multiple draw calls per submit.
Eric Anholt [Fri, 1 Aug 2014 18:24:29 +0000 (11:24 -0700)]
vc4: Align following shader recs to 16 bytes.
Otherwise, the low address bits will end up being interpreted as attribute
counts.
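The alignment itself is the usual round-up-to-a-power-of-two step. A minimal sketch, assuming offsets are 32-bit byte offsets; align_up_16() is a hypothetical name, not the driver's helper:

```c
#include <stdint.h>

/* Hypothetical helper: round a byte offset up to the next multiple of 16,
 * so the low four bits of the resulting address are always zero. */
static uint32_t
align_up_16(uint32_t offset)
{
   return (offset + 15u) & ~15u;
}
```

An offset that is already a multiple of 16 is returned unchanged; anything else moves up to the next 16-byte boundary.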
Eric Anholt [Thu, 31 Jul 2014 20:14:00 +0000 (13:14 -0700)]
vc4: Fix a potential src buffer overflow in shader rec validation.
Eric Anholt [Thu, 31 Jul 2014 19:19:29 +0000 (12:19 -0700)]
vc4: Keep a reference to BOs queued for rendering.
Otherwise, once we're not flushing at the end of every draw, we'll free
things like gallium resources, and free the backing GEM object, before
we've flushed the rendering using it to the kernel.
Eric Anholt [Thu, 31 Jul 2014 19:46:13 +0000 (12:46 -0700)]
vc4: Compute the proper end address of the relocated command lists.
render_cl_size/bin_cl_size includes relocations, while the hardware buffer
doesn't. If you don't emit a HALT packet, the command parser continues
until the end register's value. We can't allow executing unvalidated
buffer contents (and it's actually harmful in the render lists Mesa is
emitting, since VC4_PACKET_STORE_MS_TILE_BUFFER_AND_EOF doesn't trigger a
halt).
Eric Anholt [Thu, 31 Jul 2014 19:45:41 +0000 (12:45 -0700)]
vc4: Walk tiles horizontally, then vertically.
I was confused looking at my addresses in dumps because I was seeing the
tile branch offsets jumping all over.
Eric Anholt [Wed, 23 Jul 2014 18:21:04 +0000 (11:21 -0700)]
vc4: Track clears versus uncleared draws, and the clear color.
This is a step toward queueing more than one draw per frame.
Fixes piglit attribute0 test, since we get a working clear color now.
Eric Anholt [Thu, 31 Jul 2014 18:23:22 +0000 (11:23 -0700)]
vc4: Move the rest of RCL setup to flush time.
We only want to set up render target config and clear colors once per
frame.
Eric Anholt [Thu, 31 Jul 2014 18:22:17 +0000 (11:22 -0700)]
vc4: Move render command list calls to vc4_flush()
Eric Anholt [Thu, 31 Jul 2014 18:19:41 +0000 (11:19 -0700)]
vc4: Move bin command list ending commands to vc4_flush()
Eric Anholt [Wed, 23 Jul 2014 03:16:10 +0000 (20:16 -0700)]
vc4: Rename fields in the kernel interface.
I decided I didn't like "len" compared to "size", and I keep typing
shader_rec instead of shader_record[s] elsewhere, so make it consistent.
Eric Anholt [Wed, 23 Jul 2014 03:10:01 +0000 (20:10 -0700)]
vc4: Fix things to validate more than one shader state in a submit.
Eric Anholt [Mon, 21 Jul 2014 18:27:35 +0000 (11:27 -0700)]
vc4: Rewrite the kernel ABI to support texture uniform relocation.
This required building a shader parser that would walk the program to find
where the texturing-related uniforms are in the uniforms stream.
Note that as of this commit, a new kernel is required for rendering on
actual VC4 hardware (currently that commit is named "drm/vc4: Introduce
shader validation and better command stream validation.", but is likely to
be squashed as part of an eventual merge of the kernel driver).
Eric Anholt [Mon, 21 Jul 2014 18:26:24 +0000 (11:26 -0700)]
vc4: Add docs for the drm interface
Eric Anholt [Fri, 18 Jul 2014 21:18:23 +0000 (14:18 -0700)]
vc4: Add load/store to the validator
Eric Anholt [Fri, 18 Jul 2014 20:06:01 +0000 (13:06 -0700)]
vc4: Switch simulator to using kernel validator
This ensures that when I'm using the simulator, I get a closer match to
what behavior on real hardware will be. It lets me rapidly iterate on the
kernel validation code (which otherwise has a several-minute turnaround
time), and helps catch buffer overflow bugs in the userspace driver
faster.
Eric Anholt [Fri, 18 Jul 2014 20:28:34 +0000 (13:28 -0700)]
vc4: Drop pointless shader state struct
Eric Anholt [Thu, 17 Jul 2014 04:39:05 +0000 (21:39 -0700)]
vc4: Add support for texture rectangles
v2: Rebase on helpers change.
Eric Anholt [Tue, 15 Jul 2014 19:29:32 +0000 (12:29 -0700)]
vc4: Add support for texturing (under simulation)
Only rgba8888 works, and only a single texture unit, and it's only under
simulation because I haven't built the kernel interface yet.
v2: Rebase on helpers.
v3: Fold in the don't-break-the-arm-build fix.
Eric Anholt [Mon, 11 Aug 2014 21:40:06 +0000 (14:40 -0700)]
vc4: Drop PIPE_SHADER_CAP_MAX_ADDRS
Fixes the build since c10332bbb8889d733bdaa729ef23cbd90176b55d
Marek Olšák [Wed, 6 Aug 2014 21:58:10 +0000 (23:58 +0200)]
gallium: remove PIPE_SHADER_CAP_MAX_ADDRS
This limit is fixed in Mesa core and cannot be changed.
It only affects ARB_vertex_program and ARB_fragment_program.
The minimum value for ARB_vertex_program is 1 according to the spec.
The maximum value for ARB_vertex_program is limited to 1 by Mesa core.
The value should be zero for ARB_fragment_program, because it doesn't
support ARL.
Finally, drivers shouldn't mess with these values arbitrarily.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Marek Olšák [Sun, 3 Aug 2014 03:36:23 +0000 (05:36 +0200)]
st/mesa: compute supported GL versions at DRIscreen creation
This computes all GL versions before any context is created.
It's a requirement for GLX_MESA_query_renderer.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Marek Olšák [Sun, 3 Aug 2014 11:58:20 +0000 (13:58 +0200)]
gallium: pass st_config_options to query_versions
So move it from dri_context to dri_screen.
This will be needed for version computations.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Marek Olšák [Sun, 3 Aug 2014 03:35:10 +0000 (05:35 +0200)]
mesa: return version 0 if the computed core profile version is too low
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Marek Olšák [Sun, 3 Aug 2014 03:27:10 +0000 (05:27 +0200)]
mesa: add _mesa_get_version, a ctx-independent variant of _mesa_compute_version
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Marek Olšák [Sun, 3 Aug 2014 03:17:08 +0000 (05:17 +0200)]
mesa: add a context-independent variant of _mesa_override_gl_version
v2: changed GLboolean -> bool
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Marek Olšák [Sun, 3 Aug 2014 02:51:31 +0000 (04:51 +0200)]
mesa: make _mesa_init_constants context-independent and public
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Marek Olšák [Sun, 3 Aug 2014 02:42:50 +0000 (04:42 +0200)]
mesa: make _mesa_init_extensions context-independent
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Marek Olšák [Sun, 3 Aug 2014 02:36:19 +0000 (04:36 +0200)]
st/mesa: make st_init_limits context-independent
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Marek Olšák [Sun, 3 Aug 2014 02:31:56 +0000 (04:31 +0200)]
mesa: move ShaderCompilerOptions into gl_constants
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Marek Olšák [Sun, 3 Aug 2014 02:20:31 +0000 (04:20 +0200)]
st/mesa: make st_init_extensions context-independent
Setting Const.MaxSamples needed a rework, so that it doesn't call
st_choose_format, which depends on st_context.
Other than that, there is no change in functionality.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Marek Olšák [Sun, 3 Aug 2014 01:40:49 +0000 (03:40 +0200)]
mesa: make _mesa_override_glsl_version context-independent
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Marek Olšák [Sun, 3 Aug 2014 00:54:52 +0000 (02:54 +0200)]
gallium/stapi: move setting GL versions to the state tracker
All flags are set for st/mesa, so the state tracker doesn't have to check
them.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Marek Olšák [Sat, 2 Aug 2014 19:38:25 +0000 (21:38 +0200)]
st/mesa: convert the ETC1 format to an uncompressed one if unsupported
I don't know of any hardware which supports it.
With this, GL_OES_compressed_ETC1_RGB8_texture is supported if RGBA8
is supported.
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Marek Olšák [Sat, 2 Aug 2014 19:00:41 +0000 (21:00 +0200)]
st/mesa: add st_context parameter to st_mesa_format_to_pipe_format
This will be used by the next commit.
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Marek Olšák [Sat, 2 Aug 2014 20:32:25 +0000 (22:32 +0200)]
st/mesa: advertise ARB_ES3_compatibility if GLSL 3.30 and ETC2 are supported
Marek Olšák [Sat, 2 Aug 2014 18:31:07 +0000 (20:31 +0200)]
st/mesa: add support for ETC2 formats
The formats are emulated by translating them into plain uncompressed
formats, because I don't know of any hardware which supports them.
This is required for GLES 3.0 and ARB_ES3_compatibility (GL 4.3).
Marek Olšák [Sat, 2 Aug 2014 18:24:34 +0000 (20:24 +0200)]
mesa: add helper _mesa_is_format_etc2
v2: renamed GLboolean -> bool
Reviewed-by: Glenn Kennard <glenn.kennard@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Brian Paul [Mon, 11 Aug 2014 18:59:46 +0000 (12:59 -0600)]
mesa: add missing GLAPIENTRY in copyimage.c
Fixes MinGW build. Trivial.
Jason Ekstrand [Fri, 8 Aug 2014 21:30:25 +0000 (14:30 -0700)]
i965/cse: Don't eliminate instructions with side-effects
This causes problems when converting atomics to use the GRF. Sometimes the
atomic operation would get eaten by CSE when it shouldn't.
v2: Roll the has_side_effects check into is_expression
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
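The idea of folding the side-effect check into is_expression can be sketched as follows. This is a simplified model, not the i965 backend code: the opcode enum, struct inst, and can_cse() are hypothetical, and only the names has_side_effects and is_expression come from the commit message.

```c
#include <stdbool.h>

/* Hypothetical, simplified instruction record for illustration only. */
enum opcode { OP_ADD, OP_MUL, OP_MOV, OP_ATOMIC_ADD };

struct inst {
   enum opcode op;
   int src0;
   int src1;
};

/* In this model only the atomic has an effect beyond its result. */
static bool
has_side_effects(const struct inst *inst)
{
   return inst->op == OP_ATOMIC_ADD;
}

/* An instruction is a CSE candidate only if it is a pure expression. */
static bool
is_expression(const struct inst *inst)
{
   if (has_side_effects(inst))
      return false;

   switch (inst->op) {
   case OP_ADD:
   case OP_MUL:
   case OP_ATOMIC_ADD: /* value-producing by shape, excluded above */
      return true;
   default:
      return false;
   }
}

/* Two instructions may be merged only if both are pure and identical. */
static bool
can_cse(const struct inst *a, const struct inst *b)
{
   return is_expression(a) && is_expression(b) &&
          a->op == b->op && a->src0 == b->src0 && a->src1 == b->src1;
}
```

Two identical adds are still merged, while two identical atomics are both kept, because dropping either one would change what is written to memory.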
Jason Ekstrand [Mon, 4 Aug 2014 22:17:15 +0000 (15:17 -0700)]
docs/GL3: Mark ARB_copy_image as implemented on i965
Jason Ekstrand [Fri, 27 Jun 2014 23:05:37 +0000 (16:05 -0700)]
i965: Add support for ARB_copy_image
This, together with the meta path, provides a complete implementation of
ARB_copy_image.
v2: Add a fallback memcpy path for when the texture is too big for the
blitter
v3: Properly support copying between two places on the same texture in the
memcpy fallback
v4: Properly handle blit between the same two images in the fallback path
v5: Properly handle blit between the same two compressed images in the
fallback path
v6: Fix a typo in a comment
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Jason Ekstrand [Fri, 25 Jul 2014 21:08:59 +0000 (14:08 -0700)]
mesa/meta: Add a partial implementation of CopyImageSubData
This provides an implementation of CopyImageSubData that works if both
textures are uncompressed. This implementation works by using a
combination of texture views and BlitFramebuffer. If one of the textures
is compressed, it returns false and the driver is expected to provide a
fallback.
v2: Don't leak fbo's
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
v3: Change glGen/DeleteTextures to _mesa_Gen/DeleteTextures
Jason Ekstrand [Fri, 25 Jul 2014 21:07:49 +0000 (14:07 -0700)]
mesa/meta: Make _mesa_meta_bind_fbo_image also take a framebuffer target
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
Jason Ekstrand [Fri, 27 Jun 2014 22:34:53 +0000 (15:34 -0700)]
mesa: Add GL API support for ARB_copy_image
This adds the API entrypoint, error checking logic, and a driver hook for
the ARB_copy_image extension.
v2: Fix a typo in ARB_copy_image.xml and add it to the makefile
v3: Put ARB_copy_image.xml in the right place alphabetically in the
makefile and properly prefix the commit message
v4: Fixed some line wrapping and added a check for null
v5: Check for incomplete renderbuffers
Signed-off-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Neil Roberts <neil@linux.intel.com>
v6: Update dispatch_sanity for the addition of CopyImageSubData
Matt Turner [Mon, 11 Aug 2014 02:03:34 +0000 (19:03 -0700)]
i965/fs: Keep track of the registers that hold delta_x/delta_y.
They're needed in register allocation. Fixes a regression since afe3d155.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78875
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Matt Turner [Mon, 11 Aug 2014 04:32:24 +0000 (21:32 -0700)]
i965: Mark branch unreachable in sampler state code.
Silences some uninitialized variable warnings.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Brian Paul [Fri, 8 Aug 2014 21:10:31 +0000 (15:10 -0600)]
mesa: simplify _mesa_update_draw_buffers()
There's no need to copy the array of DrawBuffer enums to a temp array.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Fri, 8 Aug 2014 21:01:50 +0000 (15:01 -0600)]
mesa: fix assertion in _mesa_drawbuffers()
Fixes failed assertion when _mesa_update_draw_buffers() was called
with GL_DRAW_BUFFER == GL_FRONT_AND_BACK. The piglit gl30basic hit
this.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Brian Paul [Fri, 8 Aug 2014 19:22:28 +0000 (13:22 -0600)]
mesa: whitespace, 80-column wrapping in program.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Fri, 8 Aug 2014 19:19:49 +0000 (13:19 -0600)]
mesa: simplify/rename _mesa_init_program_struct()
No need to return a value. Remove unused ctx parameter. Remove
_mesa_ prefix since it's static.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Fri, 8 Aug 2014 13:51:47 +0000 (07:51 -0600)]
st/mesa: use PRId64 for printing 64-bit ints
v2: use signed types/formats
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Fri, 8 Aug 2014 13:49:33 +0000 (07:49 -0600)]
mesa: use PRId64 for printing 64-bit ints
Silences MinGW warnings:
warning: unknown conversion type character ‘l’ in format [-Wformat]
warning: too many arguments for format [-Wformat-extra-args]
v2: use signed types/formats
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
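For reference, the portable pattern looks like this. PRId64 is the standard macro from <inttypes.h>; format_i64() is a hypothetical wrapper used only for this illustration, not a Mesa function.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical wrapper: format a signed 64-bit value without relying on a
 * platform-specific length modifier such as "%ld" or "%lld". */
static int
format_i64(char *buf, size_t size, int64_t value)
{
   /* PRId64 expands to the correct conversion for int64_t on this target. */
   return snprintf(buf, size, "%" PRId64, value);
}
```

The macro is spliced into the format string by string-literal concatenation, so the same source line works whether int64_t is long or long long on the target.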
Brian Paul [Fri, 8 Aug 2014 13:46:45 +0000 (07:46 -0600)]
mesa: define and use ALL_TYPE_BITS in varray.c code
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Brian Paul [Fri, 8 Aug 2014 13:45:42 +0000 (07:45 -0600)]
mesa: add comment that GL_CLIP_DISTANCE0 == GL_CLIP_PLANE0 in enable.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Maarten Lankhorst [Mon, 11 Aug 2014 11:16:05 +0000 (13:16 +0200)]
configure.ac: Do not require llvm on x32
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Maarten Lankhorst <dev@mblankhorst.nl>
Neil Roberts [Tue, 1 Jul 2014 15:04:56 +0000 (16:04 +0100)]
i965: Don't check for format differences when using the blorp blitter
Previously the blorp blitter wouldn't be used if the source and destination
buffer had a different format other than swizzling between RGB and BGR and
adding or removing a dummy alpha channel. However there's no reason why the
blorp code path can't be used to do almost all format conversions so this
patch just removes the checks. However it does explicitly disable converting
to/from MESA_FORMAT_Z24_UNORM_X8_UINT because there is a similar check in
brw_blorp_copytexsubimage.
This doesn't cause any Piglit test regressions at least on Ivybridge.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Fri, 11 Jul 2014 22:54:11 +0000 (15:54 -0700)]
i965/eu: Allow math on immediates on Broadwell.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Thu, 3 Jul 2014 22:01:58 +0000 (15:01 -0700)]
i965/eu: Update jump distance scaling for Broadwell.
Broadwell measures jump distances in bytes, so we need to scale by 16.
v2: Update the function in brw_eu.h, not in brw_eu_emit.c.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Mon, 30 Jun 2014 15:00:25 +0000 (08:00 -0700)]
i965/eu: Refactor jump distance scaling to use a helper function.
Different generations of hardware measure jump distances in different
units. Previously, every function that needed to set a jump target open
coded this scaling, or made a hardcoded assumption (i.e. just used 2).
Most functions start with the number of instructions to jump, and scale
up to the hardware-specific value. So, I made the function match that.
Others start with a byte offset, and divide by a constant (8) to obtain
the jump distance. This is actually 16 / 2 (the jump scale for Gen5-7).
v2: Make the helper a static inline defined in brw_eu.h, instead of
an actual function in brw_eu_emit.c (as suggested by Matt).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
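A helper along these lines might look like the sketch below. It is not the code from brw_eu.h: the function names are hypothetical, the Gen5-7 value (2) and the 16-byte instruction size come from this message, the Broadwell value (16) comes from the Broadwell jump-scaling commit in this log, and the Gen4 value (1) is an assumption.

```c
/* Hypothetical sketch of a per-generation jump scale. */
static inline int
jump_scale(int gen)
{
   if (gen >= 8)
      return 16; /* Broadwell measures jump distances in bytes */
   if (gen >= 5)
      return 2;  /* Gen5-7, per the commit message */
   return 1;     /* assumption: Gen4 counts whole instructions */
}

/* Scale up from a number of instructions to the hardware value. */
static inline int
insts_to_jump(int gen, int num_insts)
{
   return num_insts * jump_scale(gen);
}

/* Convert a byte offset, given 16-byte instructions. For Gen5-7 the
 * divisor is 16 / 2 == 8, the constant that was previously open coded. */
static inline int
bytes_to_jump(int gen, int byte_offset)
{
   return byte_offset / (16 / jump_scale(gen));
}
```

Under these assumptions a jump over three instructions (48 bytes) yields 6 on Gen5-7 from either entry point, and 48 on Broadwell.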
Kenneth Graunke [Mon, 30 Jun 2014 15:05:42 +0000 (08:05 -0700)]
i965/eu: Set UIP on ELSE instructions on Broadwell.
Broadwell adds UIP on ELSE instructions.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Mon, 30 Jun 2014 16:22:27 +0000 (09:22 -0700)]
i965/eu: Make it clear that brw_patch_break_count only runs on Gen4-5.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Mon, 30 Jun 2014 15:06:43 +0000 (08:06 -0700)]
i965/eu: Make it clear that brw_find_loop_end only runs on Gen6+.
It has Gen6+ knowledge baked in, and indeed is only called for Gen6+,
but it wasn't immediately obvious that this was the case.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Mon, 30 Jun 2014 14:51:51 +0000 (07:51 -0700)]
i965/eu: Port Broadwell CMP destination type hack to brw_eu_emit.c.
See gen8_generator::CMP().
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Sat, 28 Jun 2014 22:30:58 +0000 (15:30 -0700)]
i965/eu: Explicitly disable instruction compaction on Broadwell for now.
Until now, it's been off implicitly: we never call the compactor
function. When we merge the generators, we'll start calling it, so we
should make it do nothing.
Matt will enable instruction compaction properly later.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Fri, 11 Jul 2014 22:48:14 +0000 (15:48 -0700)]
i965/eu: Use Haswell atomic messages on Broadwell.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Kenneth Graunke [Mon, 30 Jun 2014 14:26:30 +0000 (07:26 -0700)]
i965/eu: Change gen == 7 to gen >= 7 in a couple brw_eu_emit.c cases.
Broadwell is going to use the brw_eu_emit.c code soon. We want to get
the fake MRF handling and URB HWord channel mask handling.
We don't need the CMP thread switch workaround, though.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Ben Widawsky [Wed, 30 Jul 2014 18:39:06 +0000 (11:39 -0700)]
i965/clip: Removing scissor atom
Now that we no longer use ctx->DrawBuffer->_Xmin and related fields to
program the screen-space viewport extents, we don't depend on any
scissoring state. So we can drop the +_NEW_SCISSOR dependency.
On GEN8, a change in scissor state does not affect anything for the
clipper/sf hardware state. The hardware will always do the right thing
once the viewport extents are programmed. We can therefore remove the
unnecessary state emission.
Ken originally spotted this.
v2: Reword the commit message. Remove spurious hunk.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Ben Widawsky [Thu, 24 Jul 2014 00:55:40 +0000 (17:55 -0700)]
i965/guardband: Enable for all viewport dimensions (GEN8+)
The goal of guardband clipping is to try to avoid 3d clipping because it
is an expensive operation. When guardband clipping is disabled, all
geometry that intersects the viewport is sent to the FF 3d clipper.
Objects which are entirely enclosed within the viewport are said to be
"trivially accepted" while those entirely outside of the viewport are
"trivially rejected".
When guardband clipping is turned on the above behavior is changed such
that if the geometry is within the guardband, and intersects the
viewport, it skips the 3d clipper. Prior to GEN8, this was problematic
if the viewport was smaller than the screen as it could allow for
rendering to occur outside of the viewport. That could be mitigated if
the programmer specified a scissor region which was less than or equal
to the viewport - but this is not required for correctness in OpenGL. In
theory you could be clever with the guardband so as not to invoke this
problem. We do not do this, and have no data that suggests we should
bother (nor the converse data).
With viewport extents in place on GEN8, it should be safe to turn on
guardband clipping for all cases.
While here, add a comment to the code which confused me thoroughly.
v2: Update grammar in commit message. Reword comments based on Ken's
suggestion.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
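The decision described in this message can be modelled roughly as below. This is a simplification for illustration, not hardware or driver code: geometry is reduced to an axis-aligned bounding box, and every type and function name is hypothetical.

```c
#include <stdbool.h>

/* Hypothetical axis-aligned rectangle; x1/y1 are the far edges. */
struct rect { float x0, y0, x1, y1; };

enum clip_path {
   TRIVIAL_REJECT,  /* entirely outside the viewport */
   TRIVIAL_ACCEPT,  /* entirely inside the viewport */
   SKIP_CLIPPER,    /* crosses the viewport edge but fits the guardband */
   SEND_TO_CLIPPER  /* needs the expensive 3D clip */
};

static bool
overlaps(struct rect a, struct rect b)
{
   return a.x0 < b.x1 && b.x0 < a.x1 && a.y0 < b.y1 && b.y0 < a.y1;
}

static bool
contains(struct rect outer, struct rect inner)
{
   return inner.x0 >= outer.x0 && inner.x1 <= outer.x1 &&
          inner.y0 >= outer.y0 && inner.y1 <= outer.y1;
}

static enum clip_path
classify(struct rect bbox, struct rect viewport, struct rect guardband,
         bool guardband_enabled)
{
   if (!overlaps(bbox, viewport))
      return TRIVIAL_REJECT;
   if (contains(viewport, bbox))
      return TRIVIAL_ACCEPT;
   /* Intersects the viewport edge: the guardband may let it through. */
   if (guardband_enabled && contains(guardband, bbox))
      return SKIP_CLIPPER;
   return SEND_TO_CLIPPER;
}
```

With the guardband disabled, anything that straddles the viewport edge falls through to SEND_TO_CLIPPER, which is the cost that enabling the guardband is meant to avoid.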
Ben Widawsky [Thu, 3 Jul 2014 00:07:34 +0000 (17:07 -0700)]
i965: Simplify viewport extents programming on GEN8
Viewport extents are a 3rd rectangle that defines which pixels get
discarded as part of the rasterization process. The actual pixels drawn
to the screen are an intersection of the drawing rectangle, the viewport
extents, and the scissor rectangle. Programming the viewport extents
permits the use of guardband clipping in all cases (see later patch).
Scissor rectangle is not super important for this discussion as it should
always help do the right thing provided the programmer uses it.
switch (viewport dimensions, drawrect dimension) {
case viewport > drawing rectangle: no effects; break;
case viewport == drawing rectangle: no effects; break;
case viewport < drawing rectangle:
Pixels (after the viewport transformation but before expensive
rasterizing and shading operations) which are outside of the
viewport are discarded.
}
I am unable to find a test case where this improves performance, but in
all my testing it doesn't hurt performance, and intuitively, it should
not ever hurt performance. It also permits us to use the guardband more
freely (see upcoming patch).
v2: Updating commit message.
v3: Commit message updates requested by Ken
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
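The three-way intersection described in this message can be illustrated with a small sketch. It assumes integer pixel rectangles with exclusive far edges; struct irect and both functions are hypothetical and do not reflect how the hardware state is actually programmed.

```c
/* Hypothetical pixel rectangle: [x0, x1) by [y0, y1). */
struct irect { int x0, y0, x1, y1; };

static int imax(int a, int b) { return a > b ? a : b; }
static int imin(int a, int b) { return a < b ? a : b; }

/* Intersection of two rectangles; collapses to zero area if disjoint. */
static struct irect
intersect(struct irect a, struct irect b)
{
   struct irect r;
   r.x0 = imax(a.x0, b.x0);
   r.y0 = imax(a.y0, b.y0);
   r.x1 = imax(r.x0, imin(a.x1, b.x1));
   r.y1 = imax(r.y0, imin(a.y1, b.y1));
   return r;
}

/* Pixels that can be drawn: drawing rectangle, viewport extents, and
 * scissor rectangle all have to cover them. */
static struct irect
drawn_region(struct irect drawrect, struct irect viewport_extents,
             struct irect scissor)
{
   return intersect(intersect(drawrect, viewport_extents), scissor);
}
```

When the viewport is at least as large as the drawing rectangle, the extents drop out of the result, which matches the "no effects" cases listed in the message.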