Kenneth Graunke [Fri, 7 Sep 2012 20:24:16 +0000 (13:24 -0700)]
i965: Refactor texture swizzle generation into a helper.
It's going to be reused in a second place soon.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Vincent Lejeune [Mon, 24 Sep 2012 14:04:26 +0000 (16:04 +0200)]
radeon/llvm: improve select_cc lowering to generate CND* more often
v2: - Simplify isZero()
- Remove an unused function prototype
- Clean whitespace trails
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Chad Versace [Thu, 20 Sep 2012 16:54:29 +0000 (18:54 +0200)]
intel: Fix size of temporary etc1 buffer
Fixes valgrind errors in piglit test
oes_compressed_etc1_rgb8_texture-miptree: an invalid write in
_mesa_store_compressed_store_texsubimage() at line 4406 and invalid reads
in texcompress_etc_tmp.h:etc1_parse_block().
The calculation of the size of the temporary etc1 buffer allocated by
intel_miptree_map_etc1() was incorrect. Sometimes the allocated buffer was
too small, sometimes too large. This patch corrects the size to that
expected by _mesa_store_compressed_store_texsubimage().
Note: This is a candidate for the 9.0 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Alex Deucher [Wed, 26 Sep 2012 13:34:59 +0000 (09:34 -0400)]
radeonsi: fix truncated register define.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Brian Paul [Sat, 22 Sep 2012 15:30:24 +0000 (09:30 -0600)]
mesa: move _mesa_es_error_check_format_and_type() to glformats.c
Where the non-ES _mesa_error_check_format_and_type() function lives.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Brian Paul [Sat, 22 Sep 2012 15:30:24 +0000 (09:30 -0600)]
mesa: move GL_HALF_FLOAT_OES definition to glheader.h
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Brian Paul [Sat, 22 Sep 2012 15:30:24 +0000 (09:30 -0600)]
mesa: minor fix to glTexSubImage error message
Brian Paul [Sat, 22 Sep 2012 15:30:24 +0000 (09:30 -0600)]
mesa: consolidate sub-texture error checking code
Do all error checking of glTexSubImage, glCopyTexSubImage and
glCompressedTexSubImage's xoffset, yoffset, zoffset, width, height, and
depth params in one place.
Brian Paul [Sat, 22 Sep 2012 15:30:24 +0000 (09:30 -0600)]
mesa: consolidate glTexSubImage() error checking
Brian Paul [Sat, 22 Sep 2012 15:30:24 +0000 (09:30 -0600)]
mesa: consolidate glCompressedTexSubImage() error checking
Do all the checking in one function instead of two and fix up some of
the error checking.
Brian Paul [Sat, 22 Sep 2012 15:30:23 +0000 (09:30 -0600)]
mesa: consolidate subtexture xoffset/yoffset/width/height error checking code
This is the code that checks if a subtexture region is aligned to the
compressed format's block size.
Brian Paul [Sat, 22 Sep 2012 15:30:23 +0000 (09:30 -0600)]
mesa: consolidate glCopyTexSubImage error checking
Do all the checking in one function instead of two.
Brian Paul [Sat, 22 Sep 2012 15:30:23 +0000 (09:30 -0600)]
mesa: fix incorrect error for glCompressedSubTexImage
If a subtexture region isn't aligned to the compressed block size,
return GL_INVALID_OPERATION, not GL_INVALID_VALUE.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Eric Anholt <eric@anholt.net>
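The block-alignment rule described in the commit above can be sketched in C. This is a minimal illustration, not Mesa's actual code: the function name, its parameters, and the exact set of checks (including the out-of-range case returning GL_INVALID_VALUE and the exemption for regions that reach the image edge) are assumptions. Only the two error enums and their values are taken from the OpenGL headers.

```c
/* GL error enums, with the values used in the OpenGL headers. */
#define GL_NO_ERROR          0
#define GL_INVALID_VALUE     0x0501
#define GL_INVALID_OPERATION 0x0502

/*
 * Hypothetical helper: return the GL error for a sub-image region of a
 * block-compressed texture.  bw and bh are the block width and height of
 * the format (for example 4x4).
 */
static unsigned
compressed_subregion_error(int xoffset, int yoffset, int width, int height,
                           int tex_width, int tex_height, int bw, int bh)
{
   /* Assumed: a region outside the image is a plain value error. */
   if (xoffset < 0 || yoffset < 0 || width < 0 || height < 0 ||
       xoffset + width > tex_width || yoffset + height > tex_height)
      return GL_INVALID_VALUE;

   /* Offsets must start on a block boundary. */
   if (xoffset % bw != 0 || yoffset % bh != 0)
      return GL_INVALID_OPERATION;

   /* Sizes must be whole blocks unless the region ends at the image edge. */
   if ((width % bw != 0 && xoffset + width != tex_width) ||
       (height % bh != 0 && yoffset + height != tex_height))
      return GL_INVALID_OPERATION;

   return GL_NO_ERROR;
}
```

With 4x4 blocks, an xoffset of 2 is misaligned and yields GL_INVALID_OPERATION, which is the case the commit corrects.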
Christian Koenig [Thu, 20 Sep 2012 15:20:51 +0000 (17:20 +0200)]
radeonsi: move draw cmds to si_commands.c
Signed-off-by: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Christian Koenig [Thu, 20 Sep 2012 10:15:11 +0000 (12:15 +0200)]
radeonsi: start separating commands into si_commands.c
Signed-off-by: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Christian Koenig [Thu, 20 Sep 2012 10:00:30 +0000 (12:00 +0200)]
radeonsi: get rid of evergreen_hw_context.c
Signed-off-by: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Christian Koenig [Wed, 19 Sep 2012 12:44:17 +0000 (14:44 +0200)]
radeonsi: remove unused code
Signed-off-by: Christian Koenig <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Christian König [Fri, 14 Sep 2012 15:05:34 +0000 (17:05 +0200)]
radeonsi: start reworking inferred state handling
Instead of tracking the inferred state changes separately
just check if queued and emitted states are the same.
This patch just reworks the update of the SPI map between
vs and ps, but there are probably more cases like this.
Signed-off-by: Christian König <deathsimple@vodafone.de>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Paul Berry [Mon, 24 Sep 2012 21:47:12 +0000 (14:47 -0700)]
gles3: Prohibit set/get of GL_FRAMEBUFFER_SRGB.
GLES 3 supports sRGB functionality, but it does not expose the
GL_FRAMEBUFFER_SRGB enable/disable bit. Instead the implementation
is expected to behave as though that bit is always enabled.
This patch ensures that ctx->Color.sRGBEnabled (the internal variable
tracking GL_FRAMEBUFFER_SRGB) is initially true in GLES 2/3 contexts,
and that it cannot be modified through the GLES 3 API.
This is safe for GLES 2, since ctx->Color.sRGBEnabled has no effect on
non-sRGB formats, and GLES 2 doesn't support any sRGB formats.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Paul Berry [Mon, 24 Sep 2012 21:38:19 +0000 (14:38 -0700)]
meta: Properly save/restore GL_FRAMEBUFFER_SRGB in Meta.
Previously, meta logic was saving and restoring the value of
GL_FRAMEBUFFER_SRGB in an ad-hoc fashion. As a result, it was not
properly disabled and/or restored for some meta operations.
This patch causes GL_FRAMEBUFFER_SRGB to be saved/restored in the
conventional way of meta-ops (using _mesa_meta_begin() and
_mesa_meta_end()). It is now reliably saved/restored for
_mesa_meta_BlitFramebuffer, _mesa_meta_GenerateMipmap, and
decompress_texture_image, and preserved for all other meta ops.
Fixes piglit tests "ARB_framebuffer_sRGB/blit renderbuffer
{linear_to_srgb,srgb} scaled {disabled,enabled}".
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Paul Berry [Mon, 24 Sep 2012 21:24:28 +0000 (14:24 -0700)]
enable: Create _mesa_set_framebuffer_srgb() function for use by meta ops.
GLES3 supports sRGB formats, but it does not support the
GL_FRAMEBUFFER_SRGB enable/disable flag (instead it behaves as if this
flag is always enabled). Therefore, meta ops that need to disable
GL_FRAMEBUFFER_SRGB will need a backdoor mechanism to do so when the
API is GLES3.
We were already doing a similar thing for GL_MULTISAMPLE, which has
the same constraints.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Matt Turner [Mon, 24 Sep 2012 16:56:38 +0000 (09:56 -0700)]
targets/xorg-i915: Rename driver to i915_drv.so.
modesetting_drv.so is undescriptive and collides with
xf86-video-modesetting.
Reviewed-by: Jakob Bornecrantz <jakob@vmware.com>
Chad Versace [Tue, 4 Sep 2012 19:15:29 +0000 (12:15 -0700)]
intel: Improve teximage perf for Google Chrome paint rects (v3)
This patch reduces the time spent in glTexImage and glTexSubImage by
over 5x on Sandybridge for the workload described below.
It adds a new fast path for glTexImage2D and glTexSubImage2D,
intel_texsubimage_tiled_memcpy, which is optimized for Google Chrome's
paint rectangles. The fast path is implemented only for 2D GL_BGRA
textures for chipsets with a LLC.
=== Performance Analysis ===
Workload description:
Personalize your google.com page with a wallpaper. Start chromium
with flags "--ignore-gpu-blacklist --enable-accelerated-painting
--force-compositing-mode". Start recording with chrome://tracing. Visit
google.com and wait for page to finish rendering. Measure the time spent
by process CrGpuMain in GLES2DecoderImpl::HandleTexImage2D and
HandleTexSubImage2D.
System config:
cpu: Sandybridge Mobile GT2+ (0x0126)
kernel 3.4.9 x86_64
chromium 21.0.1180.89 (154005)
Statistics:
              | N   Median   Avg     Stddev
--------------|----------------------------
before (msec) | 8   472.5    463.75  72.6
after (msec)  | 8    78.0     79.6    5.7
Arithmetic difference at 95.0% confidence:
-384.1 +/- 55.2 msec
-82.8% +/- 11.9%
Ratio at 95.0% confidence:
5.81 +/- 0.119
v2:
- Replace check for `intel->gen >= 6` with `intel->has_llc`, per
danvet.
- Fix typo in comment, s/throuh/through/.
- Swap 'before' and 'after' rows in stat table.
v3:
- If the current batch references the bo, then flush batch before mapping
the bo. Found by Chris.
- Restrict supported texture images to level 0 of target
GL_TEXTURE_2D. This avoids an arithmetic bug in calculating image
offsets within the miptree, found by Paul. This restriction does not
diminish this patch's benefit to Chrome OS performance.
- Use fewer instructions for bit6 swizzling, suggested by Paul.
- Remove erroneous comment about Y-tiling, for Paul.
- Print perf_debug messages when flushing and stalling.
- Update stats in commit message; run workload under a release build
rather than a debug build.
Note: This is a candidate for the 9.0 branch.
Acked-by: Eric Anholt <eric@anholt.net>
CC: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Tom Stellard [Mon, 24 Sep 2012 21:07:55 +0000 (21:07 +0000)]
clover: Fix build with libclang v3.2
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tom Stellard [Mon, 17 Sep 2012 14:29:49 +0000 (14:29 +0000)]
clover: Query device for CL_DEVICE_MAX_MEM_ALLOC_SIZE v2
v2:
- Use driver reported values and don't correct them to the OpenCL
required minimum.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tom Stellard [Fri, 21 Sep 2012 20:19:14 +0000 (20:19 +0000)]
gallium: Add PIPE_COMPUTE_CAP_MAX_MEM_ALLOC_SIZE v2
v2:
- Add comment in screen.rst
- Report OpenCL required minimum for r600g
Tom Stellard [Thu, 13 Sep 2012 14:59:50 +0000 (14:59 +0000)]
r600g: Handle multiple kernels in the same program v2
v2:
- Use pc parameter of launch_grid
Blaž Tomažič [Thu, 13 Sep 2012 14:51:46 +0000 (14:51 +0000)]
clover: Handle multiple kernels in the same program v2
v2: Tom Stellard
- Use pc parameter of launch_grid()
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Brian Paul [Mon, 24 Sep 2012 21:23:20 +0000 (15:23 -0600)]
mesa: remove 'struct' from texenv_fragment_program
texenv_fragment_program is declared as a class. Fixes warnings with MSVC.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Wed, 19 Sep 2012 20:27:55 +0000 (13:27 -0700)]
i965: Allow fast depth clears if scissoring doesn't do anything.
A game we're working with leaves scissoring enabled, but frequently sets
the scissor rectangle to the size of the whole screen. In that case,
scissoring has no effect, so it's safe to go ahead with a fast clear.
Chad believes this should help with Oliver McFadden's "Dante" as well.
v2/Chad: Use the drawbuffer dimensions rather than the miptree slice
dimensions. The miptree slice may be slightly larger due to alignment
restrictions.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-and-tested-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
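The condition described in the commit above (scissoring enabled but covering the whole drawbuffer) might be tested along these lines. This is a sketch only: the struct, the function name, and the parameters are hypothetical and do not correspond to the i965 driver's real data structures.

```c
#include <stdbool.h>

/* Hypothetical scissor rectangle: origin plus size, in pixels. */
struct scissor_rect {
   int x, y, width, height;
};

/*
 * Return true when scissoring cannot clip anything, either because it is
 * disabled or because the rectangle covers the whole drawbuffer.  Per the
 * v2 note, the comparison uses the drawbuffer dimensions.
 */
static bool
scissor_is_noop(bool scissor_enabled, const struct scissor_rect *s,
                int fb_width, int fb_height)
{
   if (!scissor_enabled)
      return true;

   return s->x <= 0 && s->y <= 0 &&
          s->x + s->width >= fb_width &&
          s->y + s->height >= fb_height;
}
```

A caller would allow the fast clear path only when this returns true, and fall back to the slower path otherwise.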
Paul Berry [Wed, 19 Sep 2012 20:28:00 +0000 (13:28 -0700)]
i965: Don't spill "smeared" registers.
Fixes an assertion failure when compiling certain shaders that need both
pull constants and register spilling:
brw_eu_emit.c:204: validate_reg: Assertion `execsize >= width' failed.
NOTE: This is a candidate for release branches.
Signed-off-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Jay Cornwall [Sat, 22 Sep 2012 16:15:11 +0000 (11:15 -0500)]
nv50/ir/ra: Fix register interference tracking.
See fdo bug 55224.
Paul Berry [Mon, 24 Sep 2012 12:38:32 +0000 (05:38 -0700)]
i965/blorp: Fix sRGB MSAA resolves.
Commit e2249e8c4d06a85d6389ba1689e15d7e29aa4dff (i965/blorp: Add
support for blits between SRGB and linear formats) changed blorp to
always configure surface states in linear format (even if the
underlying surface is sRGB). This allowed sRGB-to-linear and
linear-to-sRGB blits to occur without causing the image to be
inappropriately brightened or darkened.
However, it broke sRGB MSAA resolves, since they rely on the
destination buffer format being sRGB in order to ensure that samples
are averaged together in sRGB-correct fashion.
This patch fixes the problem by instead configuring the source buffer
to use the *same* format as the destination buffer. This ensures that
the image won't be brightened or darkened, but preserves proper sRGB
averaging.
Fixes piglit tests "EXT_framebuffer_multisample/accuracy srgb".
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55265
NOTE: This is a candidate for stable release branches.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-and-tested-by: Kenneth Graunke <kenneth@whitecape.org>
Jonas Maebe [Sun, 9 Sep 2012 22:44:15 +0000 (00:44 +0200)]
darwin: do not create double-buffered offscreen pixel formats
http://xquartz.macosforge.org/trac/ticket/536
Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
Tom Stellard [Mon, 24 Sep 2012 20:49:43 +0000 (16:49 -0400)]
radeon/llvm: Fix instruction encoding for r600 family GPUs
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
https://bugs.freedesktop.org/show_bug.cgi?id=55217
Brian Paul [Mon, 24 Sep 2012 19:23:10 +0000 (13:23 -0600)]
build: remove signbit check in configure.ac
We now have a fallback macro in imports.h
This reverts part of 0f3ba405.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Brian Paul [Mon, 24 Sep 2012 19:20:33 +0000 (13:20 -0600)]
mesa: add signbit() macro
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tom Stellard [Mon, 24 Sep 2012 18:34:02 +0000 (18:34 +0000)]
r600g: Set RADEON_FLUSH_KEEP_TILING_FLAGS when emitting compute cs
Robert Bragg [Wed, 19 Sep 2012 15:12:08 +0000 (16:12 +0100)]
build: substitute X11_INCLUDES variable
There are a few automake files that reference $(X11_INCLUDES) such as
src/glx/Makefile.am but configure.ac wasn't declaring the variable for
substitution. This would break builds of glx if libxcb, for example, was
installed in its own prefix since AM_CFLAGS wouldn't coincidentally
list the needed include path in that case.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Matt Turner [Fri, 14 Sep 2012 23:04:40 +0000 (16:04 -0700)]
Use signbit() in IS_NEGATIVE and DIFFERENT_SIGNS
signbit() appears to be available everywhere (even MSVC according to
MSDN), so let's use it instead of open-coding some messy and confusing
bit twiddling macros.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54805
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
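For illustration, sign tests built on the standard signbit() macro from <math.h> could look like the following. The function names are stand-ins chosen for this sketch; the actual IS_NEGATIVE and DIFFERENT_SIGNS macro definitions are not shown in the commit message.

```c
#include <math.h>
#include <stdbool.h>

/* True if the sign bit of x is set (this includes -0.0f). */
static bool
is_negative(float x)
{
   return signbit(x) != 0;
}

/* True if x and y have different sign bits. */
static bool
different_signs(float x, float y)
{
   /* signbit() only promises zero or nonzero, so normalize before comparing. */
   return (signbit(x) != 0) != (signbit(y) != 0);
}
```

Normalizing with `!= 0` matters because the C standard only guarantees that signbit() returns a nonzero value for negative inputs, not a specific one.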
Francisco Jerez [Mon, 24 Sep 2012 16:33:35 +0000 (18:33 +0200)]
clover: Silence narrowing conversion warnings in resource.cpp.
Tom Stellard [Fri, 21 Sep 2012 20:45:50 +0000 (20:45 +0000)]
clover: Handle NULL value for clEnqueueNDRangeKernel local_work_size
[ Francisco Jerez: Slight simplification. ]
Paul Berry [Wed, 12 Sep 2012 18:13:49 +0000 (11:13 -0700)]
i965/blorp: Increase Y alignment for multisampled stencil blits.
This patch is a band-aid fix for a bug in commit 5fd67fa (i965/blorp:
Reduce alignment restrictions for stencil blits), which causes
multisampled stencil blits to work incorrectly on Sandy Bridge.
When blitting to or from a normal stencil buffer, we have to use a
coordinate transformation that swizzles coordinates to account for the
fact that stencil buffers use W tiling, but the most similar tiling
format available for textures and render targets is Y tiling. The
differences between W and Y tiling cause pixels to be scrambled within
a block of size 8x4 (width x height) as measured relative to a W tile,
or 16x2 as measured relative to a Y tile. So in order to make sure
that pixels at the edges of the blit aren't lost, we need to align the
rendering rectangle (and the buffer sizes) to multiples of the 8x4
block size. This alignment happens in the brw_blorp_blit_params
constructor, whereas the determination of how to swizzle the
coordinates happens during code generation, in the
brw_blorp_blit_program class.
When blitting to or from a multisampled stencil buffer, the coordinate
swizzling is more complex, because it has to account for the
interleaving pattern of samples, which uses 4x4 blocks for 4x MSAA and
8x4 blocks for 8x MSAA. The end result is that if multisampling is in
use, the 16x2 block size (relative to a Y tile) needs to be expanded
to 16x4, and the corresponding size relative to a W tile expands to
8x8.
The problem doesn't affect Ivy Bridge severely enough to crop up in
Piglit tests because on Ivy Bridge we have to disable multisampling
when blitting *to* a multisampled stencil buffer (the blorp compiler
generates code to compensate for the fact that multisampling is
disabled). However I suspect a bug is still present because we don't
disable multisampling when blitting *from* a multisampled stencil
buffer.
This patch fixes the problem by doubling the vertical alignment
requirement when blitting to or from a multisampled stencil buffer,
and multisampling has not been disabled.
In the long run I would like to rework the brw_blorp_blit_params
constructor--it's difficult to follow and has had several subtle bugs
like this one. However this band-aid fix should be suitable for
cherry-picking to release branches.
Fixes Piglit tests "unaligned-blit {2,4} stencil {msaa,upsample}" on
Sandy Bridge.
NOTE: This is a candidate for stable release branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
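The alignment rule from the commit above can be sketched as follows, using the block sizes it quotes relative to a W tile (8x4 normally, 8x8 when multisampling is in use). The struct and function names are hypothetical; the real logic lives in the brw_blorp_blit_params constructor and is not reproduced here.

```c
/* Hypothetical blit rectangle; x1 and y1 are exclusive. */
struct blit_rect {
   unsigned x0, y0, x1, y1;
};

static unsigned
align_down(unsigned v, unsigned a)
{
   return v - (v % a);
}

static unsigned
align_up(unsigned v, unsigned a)
{
   return align_down(v + a - 1, a);
}

/*
 * Grow the rectangle outward to whole blocks so that pixels scrambled
 * within a block at the edges of the blit are not lost.
 */
static void
align_stencil_blit_rect(struct blit_rect *r, int multisampling_in_use)
{
   const unsigned x_align = 8;
   /* The fix: double the vertical alignment when multisampling is in use. */
   const unsigned y_align = multisampling_in_use ? 8 : 4;

   r->x0 = align_down(r->x0, x_align);
   r->y0 = align_down(r->y0, y_align);
   r->x1 = align_up(r->x1, x_align);
   r->y1 = align_up(r->y1, y_align);
}
```

For a rectangle from (3,5) to (10,9), the single-sampled case expands to (0,4)-(16,12) and the multisampled case to (0,0)-(16,16).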
Brian Paul [Mon, 24 Sep 2012 14:06:56 +0000 (08:06 -0600)]
upgrade glext.h to version 85
NOTE: This is a candidate for the stable branches.
Brian Paul [Fri, 21 Sep 2012 14:09:01 +0000 (08:09 -0600)]
st/mesa: check for zero-size image in st_TestProxyTexImage()
Fixes divide by zero issue in llvmpipe driver.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Kenneth Graunke [Mon, 24 Sep 2012 05:38:58 +0000 (22:38 -0700)]
mesa: Silence narrowing warnings in ff_fragment_shader's emit_texenv().
Recent versions of GCC report a warning for the implicit conversion from
int to float:
ff_fragment_shader.cpp:897:3: warning: narrowing conversion of '(1 << ((int)rgb_shift))' from 'int' to 'float' inside { } is ill-formed in C++11 [-Wnarrowing]
This is because floats cannot precisely represent all possible 32-bit
integer values. However, texenv code is all expected to be floating
point, so this should not be a problem.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
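The precision point behind the warning can be checked with a small C helper. The function name is made up for this sketch, and it assumes the usual IEEE 754 single-precision float with a 24-bit significand.

```c
#include <stdint.h>

/* Return nonzero if v converts to float and back without changing value. */
static int
int_survives_float_round_trip(int32_t v)
{
   /* volatile forces the value to be stored at float precision. */
   volatile float f = (float)v;

   /* Compare in 64 bits so that a float of 2^31 does not overflow. */
   return (int64_t)f == (int64_t)v;
}
```

Under that assumption, 16777217 (2^24 + 1) does not survive the round trip, while a power of two such as `1 << 24` or `1 << 30` does, which is why an expression of the form `1 << shift` loses nothing when converted.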
Marek Olšák [Sun, 23 Sep 2012 15:14:25 +0000 (17:14 +0200)]
docs: fixup GL4.3 TODO list
From the OpenGL Registry:
"2012/08/13: specs named GL_ARB_debug_group, GL_ARB_debug_label, and
GL_ARB_debug_output2 were published in error during the initial OpenGL 4.3
release. All functionality in these documents was combined into
the extension GL_KHR_debug. They have been withdrawn from the registry,
and a few other extensions were renumbered to avoid holes in the numbering
scheme."
Vincent Lejeune [Thu, 6 Sep 2012 20:45:38 +0000 (22:45 +0200)]
radeon/llvm: support for interpolation intrinsics
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Marek Olšák [Fri, 14 Sep 2012 15:03:25 +0000 (17:03 +0200)]
draw: fix non-indexed draw calls if there's an index buffer
pipe_draw_info::indexed determines if it should be indexed and not
the presence of an index buffer.
This fixes crashes in r300g.
NOTE: This is a candidate for the stable branches.
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tom Stellard [Sat, 22 Sep 2012 00:07:14 +0000 (20:07 -0400)]
r600g: Fix build with LLVM compiler
Marek Olšák [Tue, 18 Sep 2012 23:29:17 +0000 (01:29 +0200)]
r600g: set QUANT_MODE on Cayman too
This fixes piglit/fbo-blit-stretched.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Tue, 18 Sep 2012 18:21:11 +0000 (20:21 +0200)]
r600g: use CS helpers to emit streamout state
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Tue, 18 Sep 2012 18:10:15 +0000 (20:10 +0200)]
r600g: remove initialization of unused loop register tables
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Tue, 18 Sep 2012 17:49:41 +0000 (19:49 +0200)]
r600g: remove now-unused SURFACE_BASE_UPDATE logic
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Tue, 18 Sep 2012 17:46:59 +0000 (19:46 +0200)]
r600g: remove unused CB registers from register lists
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Tue, 18 Sep 2012 17:42:29 +0000 (19:42 +0200)]
r600g: atomize framebuffer state
Tested on RS880, Evergreen and Cayman.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Marek Olšák [Mon, 17 Sep 2012 21:22:00 +0000 (23:22 +0200)]
r600g: don't snoop context state while building shaders
Let's use the shader key describing the state.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Anuj Phogat [Thu, 20 Sep 2012 20:17:19 +0000 (13:17 -0700)]
meta: Add on demand compilation of per target shader programs
A call to glGenerateMipmap() follows the generation of a relevant
shader program in setup_glsl_generate_mipmap().
To support all texture targets and to avoid compiling shaders
every time, per target shader programs are compiled on demand
and saved for the next call.
Fixes float-texture(mipmap.manual):
See Comment 6: https://bugs.freedesktop.org/show_bug.cgi?id=54296
NOTE: This is a candidate for stable branches.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
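The compile-on-demand scheme described in the commit above amounts to a small per-target cache. The sketch below shows only that pattern: every name in it is hypothetical, and fake_compile_program() stands in for real shader compilation, which the commit message does not detail.

```c
/* Hypothetical texture targets, for this sketch only. */
enum tex_target {
   TARGET_1D,
   TARGET_2D,
   TARGET_3D,
   TARGET_CUBE,
   TARGET_COUNT
};

struct mipmap_shader_cache {
   unsigned program[TARGET_COUNT];  /* 0 means "not compiled yet" */
   unsigned compile_count;          /* how many compiles actually ran */
};

/* Stand-in for the expensive step; returns a nonzero program handle. */
static unsigned
fake_compile_program(struct mipmap_shader_cache *cache, enum tex_target t)
{
   cache->compile_count++;
   return 100u + (unsigned)t;
}

/* Compile on first use for a target, then reuse the saved program. */
static unsigned
get_mipmap_program(struct mipmap_shader_cache *cache, enum tex_target t)
{
   if (cache->program[t] == 0)
      cache->program[t] = fake_compile_program(cache, t);
   return cache->program[t];
}
```

Requesting the same target twice runs the compile step once; a different target triggers one more compile.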
Tom Stellard [Mon, 17 Sep 2012 14:31:31 +0000 (14:31 +0000)]
clover: Initialize height and depth to 1 for transfers
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tom Stellard [Thu, 13 Sep 2012 14:53:32 +0000 (14:53 +0000)]
pipe-loader: Remove a few debug_printfs
On debug builds these were always being printed.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Tom Stellard [Thu, 13 Sep 2012 15:21:42 +0000 (15:21 +0000)]
radeon/llvm: Handle loads from the constants address space.
Reading from constant memory is not supported yet, so constant reads use
global memory.
Tom Stellard [Thu, 13 Sep 2012 15:20:46 +0000 (15:20 +0000)]
radeon/llvm: Add support for v4f32 stores on R600
Tom Stellard [Thu, 13 Sep 2012 15:19:48 +0000 (15:19 +0000)]
radeon/llvm: Add support for i8 reads on R600
Tom Stellard [Thu, 13 Sep 2012 15:14:26 +0000 (15:14 +0000)]
radeon/llvm: Expand vector fadd and fmul on R600
Tom Stellard [Thu, 13 Sep 2012 15:08:40 +0000 (15:08 +0000)]
radeon/llvm: Add optimization for FP_ROUND
Tom Stellard [Thu, 13 Sep 2012 15:04:15 +0000 (15:04 +0000)]
radeon/llvm: Replace AMDGPU pow intrinsic with the llvm version
Paul Berry [Thu, 13 Sep 2012 03:51:07 +0000 (20:51 -0700)]
i965/blorp: Fix narrowing warnings.
Blorp has to convert rectangle coordinates from integers to floats in
order to send them down the GPU pipeline. Recent versions of GCC
issue a warning for this, since a float is not capable of precisely
representing all possible 32-bit integer values. Suppress the warning
with an explicit type cast in the case of blorp, since rectangle
coordinates will never be large enough to cause a loss of precision.
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Thu, 20 Sep 2012 23:31:15 +0000 (16:31 -0700)]
i965: Remove brw_set_predicate_inverse(p, true) from scratch offset code
Given that it exists between a push/pop of instruction state, this call
can only affect the MOV or ADD instruction generated just below it.
Neither of those instructions is predicated, so it makes no sense to
ask for the inverse predicate.
This fixes grumblings from the simulator debugger, which was
complaining about an invalid predicate.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Kenneth Graunke [Wed, 19 Sep 2012 19:01:14 +0000 (12:01 -0700)]
mesa: Don't override S3TC internalFormat if data is pre-compressed.
Commit 42723d88d intended to override an S3TC internalFormat to a
generic compressed format when the application requested online
compression of uncompressed data. Unfortunately, it also broke
pre-compressed textures when libtxc_dxtn isn't installed but the
extensions are forced on.
Both glCompressedTexImage2D() and glTexImage2D() call teximage(), which
calls _mesa_choose_texture_format(), hitting this override code. If we
have actual S3TC source data, we can't treat it as any other format, and
need to avoid the override.
Since glCompressedTexImage2D() passes in a format of GL_NONE (which is
illegal for glTexImage), we can use that to detect the pre-compressed
case and avoid the overrides.
Fixes a regression since 42723d88d370a7599398cc1c2349aeb951ba1c57.
NOTE: This is a candidate for the 9.0 branch.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-and-tested-by: Jordan Justen <jordan.l.justen@intel.com>
Kenneth Graunke [Tue, 11 Sep 2012 23:20:43 +0000 (16:20 -0700)]
i965/blorp: Add support for blits between SRGB and linear formats.
Fixes colorspace issues in L4D2 when multisampling is enabled (the
scene was far too dark, but the flashlight area was way too bright).
The nVidia and AMD binary drivers both allow this kind of blit.
NOTE: This is a candidate for the 9.0 branch.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Kenneth Graunke [Tue, 4 Sep 2012 18:29:30 +0000 (11:29 -0700)]
mesa: Ignore SRGB when determining compatible resolve formats.
MSAA resolves and other blit-like operations ignore SRGB state anyway,
so we should be able to safely allow resolves between compatible
SRGB/linear formats like SRGBA8 and RGBA8888.
This matches the behavior of the nVidia and AMD binary drivers.
Fixes completely black rendering when using multisampling in L4D2.
NOTE: This is a candidate for the 9.0 branch.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Andreas Boll [Thu, 20 Sep 2012 14:23:15 +0000 (16:23 +0200)]
docs: update some more FAQs
v2: remove mention of XFree86
Reviewed-by: Brian Paul <brianp@vmware.com>
Andreas Boll [Thu, 20 Sep 2012 14:01:37 +0000 (16:01 +0200)]
docs: remove utility.html
This page is very old and some of the links are dead.
Reviewed-by: Brian Paul <brianp@vmware.com>
Andreas Boll [Thu, 20 Sep 2012 14:01:35 +0000 (16:01 +0200)]
docs: remove science.html
This page is very old and some of the links are dead.
Reviewed-by: Brian Paul <brianp@vmware.com>
Andreas Boll [Thu, 20 Sep 2012 14:01:32 +0000 (16:01 +0200)]
docs: remove modelers.html
This page is very old and some of the links are dead.
Reviewed-by: Brian Paul <brianp@vmware.com>
Andreas Boll [Thu, 20 Sep 2012 14:01:24 +0000 (16:01 +0200)]
docs: remove libraries.html
This page is very old and some of the links are dead.
Reviewed-by: Brian Paul <brianp@vmware.com>
Andreas Boll [Thu, 20 Sep 2012 14:01:21 +0000 (16:01 +0200)]
docs: remove games.html
This page is very old and some of the links are dead.
Reviewed-by: Brian Paul <brianp@vmware.com>
Andreas Boll [Thu, 20 Sep 2012 14:01:18 +0000 (16:01 +0200)]
docs/contents: add autoconf.html link
make it easier to find the docs/autoconf.html site
Reviewed-by: Brian Paul <brianp@vmware.com>
Andreas Boll [Thu, 20 Sep 2012 14:01:15 +0000 (16:01 +0200)]
docs: convert last traces of progs to mesa/demos repository
v2: fix typo
Reviewed-by: Brian Paul <brianp@vmware.com>
Andreas Boll [Thu, 20 Sep 2012 14:01:12 +0000 (16:01 +0200)]
docs: add IRC info
Reviewed-by: Brian Paul <brianp@vmware.com>
Andreas Boll [Thu, 20 Sep 2012 14:01:08 +0000 (16:01 +0200)]
docs/egl: improve markup
replace unordered list <ul> with definition list <dl>
Reviewed-by: Brian Paul <brianp@vmware.com>
Andreas Boll [Thu, 20 Sep 2012 14:01:03 +0000 (16:01 +0200)]
docs/autoconf: improve markup
replace unordered list <ul> with definition list <dl>
Reviewed-by: Brian Paul <brianp@vmware.com>
Andreas Boll [Thu, 20 Sep 2012 14:00:52 +0000 (16:00 +0200)]
docs/autoconf: remove obsolete demo options
removed with commit 56c3cce2a199f7f79a48d7633431e1e80fcd4ba2
two years ago
Reviewed-by: Brian Paul <brianp@vmware.com>
Andreas Boll [Thu, 20 Sep 2012 13:22:37 +0000 (15:22 +0200)]
docs: improve quality of gears.png
Reviewed-by: Brian Paul <brianp@vmware.com>
Brian Paul [Wed, 19 Sep 2012 18:43:38 +0000 (12:43 -0600)]
gallium: mention PIPE_TIMEOUT_INFINITE in the fence_finish() comment
Brian Paul [Thu, 20 Sep 2012 15:13:37 +0000 (09:13 -0600)]
llvmpipe: fix overflow bug in total texture size computation
v2: use uint64_t for the total_size variable, per Jose.
Also add two earlier checks for exceeding the max texture size.
For example a 1K^3 RGBA volume would overflow the lpr->image_stride
variable.
Use simple algebra to avoid overflow in intermediate values.
So instead of "x * y > z" use "x > z / y".
This should work if we happen to be on a platform that doesn't have
64-bit types.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
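The algebraic rearrangement mentioned in the commit above ("x > z / y" instead of "x * y > z") can be written as a small C helper. The function name and the use of size_t are choices made for this sketch, not taken from the llvmpipe source.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if x * y would be greater than limit, without ever
 * computing x * y (which could wrap around in a fixed-width type).
 */
static bool
product_exceeds(size_t x, size_t y, size_t limit)
{
   if (y == 0)
      return false;   /* the product is 0, which never exceeds limit */

   /* For integers with y > 0, x * y > limit exactly when x > limit / y. */
   return x > limit / y;
}
```

Because the division truncates, the comparison stays exact for integer operands, and no 64-bit intermediate is needed.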
Alex Deucher [Thu, 20 Sep 2012 15:16:36 +0000 (11:16 -0400)]
r600g/llvm: rs780/rs880 are r600 asics
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Ian Romanick [Tue, 18 Sep 2012 13:19:18 +0000 (15:19 +0200)]
mesa: Allow glGetTexParameter of GL_TEXTURE_SRGB_DECODE_EXT
This was already (correctly) supported for glGetSamplerParameter paths.
NOTE: This is a candidate for stable branches.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Tom Stellard [Thu, 6 Sep 2012 04:20:27 +0000 (00:20 -0400)]
r300/compiler: Use precomputed q values in the register allocator
Tom Stellard [Thu, 6 Sep 2012 04:20:27 +0000 (00:20 -0400)]
r300g: Init regalloc state during context creation
Initializing the regalloc state is expensive, and since it is always
the same for every compile we only need to initialize it once per
context. This should help improve shader compile times for the driver.
Tom Stellard [Mon, 3 Sep 2012 12:25:13 +0000 (08:25 -0400)]
r300/compiler: Don't create register classes for inputs
Tom Stellard [Mon, 3 Sep 2012 14:43:45 +0000 (10:43 -0400)]
ra: Add q_values parameter to ra_set_finalize()
This allows the user to pass precomputed q values to the allocator.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tom Stellard [Mon, 3 Sep 2012 12:23:02 +0000 (08:23 -0400)]
ra: Clarify usage of ra_set_node_reg()
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tom Stellard [Wed, 1 Aug 2012 20:42:53 +0000 (20:42 +0000)]
r600g: Invalidate texture cache when creating vertex buffers for compute v2
Compute shaders fetch data from vertex buffers via the texture cache, so
we need to make sure the texture cache is flushed.
v2:
- Fix rebase mistake
- Fix spelling in comment
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Tom Stellard [Mon, 17 Sep 2012 14:33:56 +0000 (14:33 +0000)]
r600g: Use LOOP_START_DX10 for loops
LOOP_START_DX10 ignores the LOOP_CONFIG* registers, so it is not limited
to 4096 iterations like the other LOOP_* instructions. Compute shaders
need to use this instruction, and since we aren't optimizing loops with
the LOOP_CONFIG* registers for pixel and vertex shaders, it seems like
we should just use it for everything.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Tom Stellard [Thu, 13 Sep 2012 17:15:57 +0000 (17:15 +0000)]
r600g: Set the correct value of COLOR*_DIM for RATs
For buffers (which is what is being used for RATs), the
COLOR*_DIM.WIDTH_MASK field needs to be set to the low 16-bits of the
buffer size, and the COLOR*_DIM.HEIGHT_MAX needs to be set to the
high bits.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
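The split described in the commit above (low 16 bits of the buffer size in one field, the remaining high bits in the other) is shown below as plain bit arithmetic. The struct and function are hypothetical, and the sketch does not model where the two fields sit inside the COLOR*_DIM register, since the commit message does not say.

```c
#include <stdint.h>

/* Hypothetical holder for the two field values, before register packing. */
struct dim_fields {
   uint32_t low16;  /* low 16 bits of the buffer size */
   uint32_t high;   /* remaining high bits of the buffer size */
};

static struct dim_fields
split_buffer_size(uint32_t size)
{
   struct dim_fields f;

   f.low16 = size & 0xffffu;
   f.high = size >> 16;
   return f;
}
```

For a size of 0x12345678 this gives 0x5678 for the low field and 0x1234 for the high field.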
Tom Stellard [Thu, 13 Sep 2012 17:14:56 +0000 (17:14 +0000)]
r600g: Make sure to initialize DB_DEPTH_CONTROL register for compute
The kernel CS checker will fail if this register is not initialized.
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Tom Stellard [Thu, 13 Sep 2012 14:37:53 +0000 (14:37 +0000)]
r600g: Add some comments and debug printfs to compute code
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Tom Stellard [Wed, 19 Sep 2012 19:27:32 +0000 (15:27 -0400)]
r600g: Add missing break to case statement
Michal Sciubidlo [Wed, 12 Sep 2012 06:57:01 +0000 (08:57 +0200)]
radeon/llvm: Emit ISA for ALU instructions in the R600 code emitter
Signed-off-by: Tom Stellard <thomas.stellard@amd.com>