--- /dev/null
+Name
+
+ INTEL_shader_atomic_float_minmax
+
+Name Strings
+
+ GL_INTEL_shader_atomic_float_minmax
+
+Contact
+
+ Ian Romanick (ian . d . romanick 'at' intel . com)
+
+Contributors
+
+
+Status
+
+ In progress
+
+Version
+
+ Last Modified Date: 06/22/2018
+ Revision: 4
+
+Number
+
+ TBD
+
+Dependencies
+
+ OpenGL 4.2, OpenGL ES 3.1, ARB_shader_storage_buffer_object, or
+ ARB_compute_shader is required.
+
+ This extension is written against version 4.60 of the OpenGL Shading
+ Language Specification.
+
+Overview
+
+ This extension provides GLSL built-in functions allowing shaders to
+ perform atomic read-modify-write operations on floating-point buffer
+ variables and shared variables. The supported operations are minimum,
+ maximum, exchange, and compare-and-swap.
+
+
+New Procedures and Functions
+
+ None.
+
+New Tokens
+
+ None.
+
+IP Status
+
+ None.
+
+Modifications to the OpenGL Shading Language Specification, Version 4.60
+
+ Including the following line in a shader can be used to control the
+ language features described in this extension:
+
+ #extension GL_INTEL_shader_atomic_float_minmax : <behavior>
+
+ where <behavior> is as specified in section 3.3.
+
+ New preprocessor #defines are added to the OpenGL Shading Language:
+
+ #define GL_INTEL_shader_atomic_float_minmax 1
+
+Additions to Chapter 8 of the OpenGL Shading Language Specification
+(Built-in Functions)
+
+ Modify Section 8.11, "Atomic Memory Functions"
+
+ (add a new row after the existing "atomicMin" table row, p. 179)
+
+ float atomicMin(inout float mem, float data)
+
+ Computes a new value by taking the minimum of the value of data and
+ the contents of mem. If one of these is an IEEE signaling NaN (i.e.,
+ a NaN with the most-significant bit of the mantissa cleared), it is
+ always considered smaller. If one of these is an IEEE quiet NaN
+ (i.e., a NaN with the most-significant bit of the mantissa set), it is
+ always considered larger. If both are IEEE quiet NaNs or both are
+ IEEE signaling NaNs, the result of the comparison is undefined.
+
+ (add a new row after the existing "atomicMax" table row, p. 179)
+
+ float atomicMax(inout float mem, float data)
+
+ Computes a new value by taking the maximum of the value of data and
+ the contents of mem. If one of these is an IEEE signaling NaN (i.e.,
+ a NaN with the most-significant bit of the mantissa cleared), it is
+ always considered larger. If one of these is an IEEE quiet NaN (i.e.,
+ a NaN with the most-significant bit of the mantissa set), it is always
+ considered smaller. If both are IEEE quiet NaNs or both are IEEE
+ signaling NaNs, the result of the comparison is undefined.
+
+ (add to "atomicExchange" table cell, p. 180)
+
+ float atomicExchange(inout float mem, float data)
+
+ (add to "atomicCompSwap" table cell, p. 180)
+
+ float atomicCompSwap(inout float mem, float compare, float data)
+
+Interactions with OpenGL 4.6 and ARB_gl_spirv
+
+ If OpenGL 4.6 or ARB_gl_spirv is supported, then
+ SPV_INTEL_shader_atomic_float_minmax must also be supported.
+
+ The AtomicFloatMinmaxINTEL capability is available whenever the OpenGL or
+ OpenGL ES implementation supports INTEL_shader_atomic_float_minmax.
+
+Issues
+
+ 1) Why call this extension INTEL_shader_atomic_float_minmax?
+
+ RESOLVED. Several other extensions already set the precedent of
+ VENDOR_shader_atomic_float and VENDOR_shader_atomic_float64 for extensions
+ that enable floating-point atomic operations. Using that as a base for
+ the name seems logical.
+
+ There already exists NV_shader_atomic_float, but the two extensions have
+ nearly zero overlap in functionality. NV_shader_atomic_float adds
+ atomicAdd and image atomic operations that currently shipping Intel GPUs
+ do not support. Calling this extension INTEL_shader_atomic_float would
+ likely have been confusing.
+
+ Adding something to describe the actual functions added by this extension
+ seemed reasonable. INTEL_shader_atomic_float_compare was considered, but
+ that name was deemed insufficiently descriptive. Calling this
+ extension INTEL_shader_atomic_float_min_max_exchange_compswap is right
+ out.
+
+ 2) What atomic operations should we support for floating-point targets?
+
+ RESOLVED. Exchange, min, max, and compare-swap make sense, and these are
+ all supported by the hardware. Future extensions may add other functions.
+
+ For buffer variables and shared variables it is not possible to bit-cast
+ the memory location in GLSL, so existing integer operations, such as
+ atomicOr, cannot be applied to floating-point variables. However, the
+ underlying hardware implementation can do this by treating the memory as
+ an integer. For example, atomicNegate could be implemented by using
+ atomicXor to flip the sign bit. It is unclear whether this provides any
+ actual utility.
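The sign-bit idea from issue 2 can be sketched on the CPU side. The C11 model below is not part of the extension text: `atomic_negate_model` is a hypothetical name, and the memory location is assumed to hold an IEEE 754 binary32 value kept in a `uint32_t`.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is IEEE 754 binary32, so the sign is bit 31. */
static_assert(sizeof(float) == sizeof(uint32_t), "model assumes 32-bit float");

/*
 * Hypothetical atomicNegate model (not defined by this extension).
 * Flips the sign bit with an integer atomic XOR and returns the value that
 * was stored before the operation.
 */
static float atomic_negate_model(_Atomic uint32_t *mem)
{
    uint32_t old_bits = atomic_fetch_xor(mem, UINT32_C(0x80000000));
    float old_value;

    /* Reinterpret the previous bit pattern as a float without aliasing UB. */
    memcpy(&old_value, &old_bits, sizeof old_value);
    return old_value;
}
```

Because only the sign bit changes, the same operation also flips the sign of zeros, infinities, and NaN payloads, which may or may not be what a caller wants.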
+
+ 3) What should be said about the NaN behavior?
+
+ RESOLVED. There are several aspects of NaN behavior that should be
+ documented in this extension. However, some of this behavior varies based
+ on NaN concepts that do not exist in the GLSL specification.
+
+ * atomicCompSwap performs the comparison using the floating-point equality
+ operator (==). That is, if either 'mem' or 'compare' is NaN, the
+ comparison result is always false and 'mem' is left unchanged.
+
+ * atomicMin and atomicMax implement the IEEE specification with respect to
+ NaN. IEEE distinguishes two kinds of NaN: signaling NaN and quiet
+ NaN. A quiet NaN has the most significant bit of the mantissa set, and
+ a signaling NaN does not. This distinction does not exist in SPIR-V,
+ Vulkan, or OpenGL. Let qNaN denote a quiet NaN and sNaN denote a
+ signaling NaN. atomicMin and atomicMax specifically implement:
+
+ - fmin(qNaN, x) = fmin(x, qNaN) = fmax(qNaN, x) = fmax(x, qNaN) = x
+ - fmin(sNaN, x) = fmin(x, sNaN) = fmax(sNaN, x) = fmax(x, sNaN) = sNaN
+ - fmin(sNaN, qNaN) = fmin(qNaN, sNaN) = fmax(sNaN, qNaN) =
+ fmax(qNaN, sNaN) = sNaN
+ - fmin(sNaN, sNaN) = sNaN. This specification does not define which of
+ the two arguments is stored.
+ - fmax(sNaN, sNaN) = sNaN. This specification does not define which of
+ the two arguments is stored.
+ - fmin(qNaN, qNaN) = qNaN. This specification does not define which of
+ the two arguments is stored.
+ - fmax(qNaN, qNaN) = qNaN. This specification does not define which of
+ the two arguments is stored.
+
+ Further details can be found in the Skylake Programmer's Reference
+ Manuals, available at
+ https://01.org/linuxgraphics/documentation/hardware-specification-prms.
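The NaN rules in issue 3 can be restated as a small scalar reference model. This C sketch is not part of the extension text and is not atomic: the function names are hypothetical, binary32 floats are assumed, and where the issue leaves the stored argument unspecified the model simply picks one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(float) == sizeof(uint32_t), "model assumes 32-bit float");

/* NaN: exponent all ones and a non-zero mantissa. */
static bool is_nan_bits(uint32_t u)
{
    return (u & 0x7f800000u) == 0x7f800000u && (u & 0x007fffffu) != 0;
}

/* Quiet NaN: most significant mantissa bit (bit 22) set. */
static bool is_qnan_bits(uint32_t u)
{
    return is_nan_bits(u) && (u & 0x00400000u) != 0;
}

/* Signaling NaN: most significant mantissa bit clear. */
static bool is_snan_bits(uint32_t u)
{
    return is_nan_bits(u) && (u & 0x00400000u) == 0;
}

static float bits_to_float(uint32_t u)
{
    float f;

    memcpy(&f, &u, sizeof f);
    return f;
}

/*
 * Bit pattern that atomicMin (is_min) or atomicMax (!is_min) would store.
 * Works on raw bits so that a signaling NaN is never quieted in transit.
 */
static uint32_t minmax_model(uint32_t mem, uint32_t data, bool is_min)
{
    /* sNaN beats everything. For (sNaN, sNaN) the stored argument is
     * unspecified; this model arbitrarily keeps mem. */
    if (is_snan_bits(mem))
        return mem;
    if (is_snan_bits(data))
        return data;

    /* qNaN loses to everything else. For (qNaN, qNaN) the stored argument
     * is unspecified; this model arbitrarily stores data. */
    if (is_qnan_bits(mem))
        return data;
    if (is_qnan_bits(data))
        return mem;

    /* Ordinary values. -0.0 and +0.0 compare equal here, so this sketch
     * does not order signed zeros. */
    float m = bits_to_float(mem), d = bits_to_float(data);

    if (is_min)
        return d < m ? data : mem;
    return d > m ? data : mem;
}

/* atomicCompSwap model: plain == comparison, so NaN never matches. */
static float compswap_model(float *mem, float compare, float data)
{
    float old = *mem;

    if (old == compare)
        *mem = data;
    return old;
}
```

For example, `minmax_model(0x7fc00000u, 0x3f800000u, true)` yields the bits of 1.0 (the quiet NaN loses), while `minmax_model(0x7fa00000u, 0x3f800000u, true)` yields the signaling NaN itself.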
+
+ 4) What about atomicMin and atomicMax with (+0.0, -0.0) or (-0.0, +0.0)
+ arguments?
+
+ RESOLVED. atomicMin should store -0.0, and atomicMax should store +0.0.
+ Due to a known issue, shipping Skylake GPUs store the incorrectly signed
+ zero instead. This behavior may change in later GPUs.
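The intended signed-zero ordering from issue 4 can be written out as a scalar C sketch. It is not part of the extension text, it is not atomic, the function names are hypothetical, and NaN inputs are deliberately left out of scope.

```c
#include <math.h>

/* Intended atomicMin result for non-NaN inputs: -0.0 orders below +0.0. */
static float min_zero_model(float mem, float data)
{
    if (mem == 0.0f && data == 0.0f)
        return (signbit(mem) || signbit(data)) ? -0.0f : 0.0f;
    return data < mem ? data : mem;
}

/* Intended atomicMax result for non-NaN inputs: +0.0 orders above -0.0. */
static float max_zero_model(float mem, float data)
{
    if (mem == 0.0f && data == 0.0f)
        return (signbit(mem) && signbit(data)) ? -0.0f : 0.0f;
    return data > mem ? data : mem;
}
```

A test that checks `signbit` on the stored result for the (+0.0, -0.0) and (-0.0, +0.0) argument pairs would distinguish the intended behavior from the Skylake behavior described in this issue.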
+
+Revision History
+
+ Rev Date Author Changes
+ --- ---------- -------- ---------------------------------------------
+ 1 04/19/2018 idr Initial version
+ 2 05/05/2018 idr Describe interactions with the capabilities
+ added by SPV_INTEL_shader_atomic_float_minmax.
+ 3 05/29/2018 idr Remove mention of 64-bit float support.
+ 4 06/22/2018 idr Resolve issue #2.
+ Add issue #3 (regarding NaN behavior).
+ Add issue #4 (regarding atomicMin(-0, +0)).