Name

    INTEL_shader_atomic_float_minmax

Name Strings

    GL_INTEL_shader_atomic_float_minmax

Contact

    Ian Romanick (ian . d . romanick 'at' intel . com)

Contributors


Status

    In progress

Version

    Last Modified Date: 06/22/2018
    Revision: 4

Number

    TBD

Dependencies

    OpenGL 4.2, OpenGL ES 3.1, ARB_shader_storage_buffer_object, or
    ARB_compute_shader is required.

    This extension is written against version 4.60 of the OpenGL Shading
    Language Specification.

Overview

    This extension provides GLSL built-in functions allowing shaders to
    perform atomic read-modify-write operations to floating-point buffer
    variables and shared variables.  Minimum, maximum, exchange, and
    compare-and-swap are enabled.

New Procedures and Functions

    None.

New Tokens

    None.

IP Status

    None.

Modifications to the OpenGL Shading Language Specification, Version 4.60

    Including the following line in a shader can be used to control the
    language features described in this extension:

      #extension GL_INTEL_shader_atomic_float_minmax : <behavior>

    where <behavior> is as specified in section 3.3.

    New preprocessor #defines are added to the OpenGL Shading Language:

      #define GL_INTEL_shader_atomic_float_minmax   1

Additions to Chapter 8 of the OpenGL Shading Language Specification
(Built-in Functions)

    Modify Section 8.11, "Atomic Memory Functions"

    (add a new row after the existing "atomicMin" table row, p. 179)

        float atomicMin(inout float mem, float data)

        Computes a new value by taking the minimum of the value of data and
        the contents of mem.  If one of these is an IEEE signaling NaN (i.e.,
        a NaN with the most-significant bit of the mantissa cleared), it is
        always considered smaller.  If one of these is an IEEE quiet NaN
        (i.e., a NaN with the most-significant bit of the mantissa set), it is
        always considered larger.  If both are IEEE quiet NaNs or both are
        IEEE signaling NaNs, the result of the comparison is undefined.

    (add a new row after the existing "atomicMax" table row, p. 179)

        float atomicMax(inout float mem, float data)

        Computes a new value by taking the maximum of the value of data and
        the contents of mem.  If one of these is an IEEE signaling NaN (i.e.,
        a NaN with the most-significant bit of the mantissa cleared), it is
        always considered larger.  If one of these is an IEEE quiet NaN (i.e.,
        a NaN with the most-significant bit of the mantissa set), it is always
        considered smaller.  If both are IEEE quiet NaNs or both are IEEE
        signaling NaNs, the result of the comparison is undefined.

    (add to "atomicExchange" table cell, p. 180)

        float atomicExchange(inout float mem, float data)

    (add to "atomicCompSwap" table cell, p. 180)

        float atomicCompSwap(inout float mem, float compare, float data)
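
    As a non-normative illustration, the float atomicCompSwap can be
    modeled in plain C.  This is a minimal single-threaded sketch, assuming
    the comparison behaves like the floating-point equality operator (==)
    as resolved in issue 3.  The name model_atomic_comp_swap is
    hypothetical and is not part of this extension or of any API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-threaded model of the float atomicCompSwap.
 * The contents of *mem are replaced by data only if they compare equal
 * to compare under the floating-point == operator.  Like the other GLSL
 * atomic memory functions, it returns the original contents of mem. */
static float model_atomic_comp_swap(float *mem, float compare, float data)
{
    assert(mem != NULL);

    float original = *mem;
    if (original == compare)    /* always false if either operand is NaN */
        *mem = data;
    return original;
}
```

    Under this model a NaN in either mem or compare means the store never
    happens, so a retry loop that waits for the swap to succeed would not
    terminate once mem holds a NaN.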

Interactions with OpenGL 4.6 and ARB_gl_spirv

    If OpenGL 4.6 or ARB_gl_spirv is supported, then
    SPV_INTEL_shader_atomic_float_minmax must also be supported.

    The AtomicFloatMinmaxINTEL capability is available whenever the OpenGL or
    OpenGL ES implementation supports INTEL_shader_atomic_float_minmax.

Issues

    1) Why call this extension INTEL_shader_atomic_float_minmax?

    RESOLVED.  Several other extensions already set the precedent of
    VENDOR_shader_atomic_float and VENDOR_shader_atomic_float64 for extensions
    that enable floating-point atomic operations.  Using that as a base for
    the name seems logical.

    There already exists NV_shader_atomic_float, but the two extensions have
    nearly zero overlap in functionality.  NV_shader_atomic_float adds
    atomicAdd and image atomic operations that currently shipping Intel GPUs
    do not support.  Calling this extension INTEL_shader_atomic_float would
    likely have been confusing.

    Adding something to describe the actual functions added by this extension
    seemed reasonable.  INTEL_shader_atomic_float_compare was considered, but
    that name was deemed to be not properly descriptive.  Calling this
    extension INTEL_shader_atomic_float_min_max_exchange_compswap is right
    out.

    2) What atomic operations should we support for floating-point targets?

    RESOLVED.  Exchange, min, max, and compare-swap make sense, and these are
    all supported by the hardware.  Future extensions may add other functions.

    For buffer variables and shared variables it is not possible to bit-cast
    the memory location in GLSL, so existing integer operations, such as
    atomicOr, cannot be used.  However, the underlying hardware implementation
    can do this by treating the memory as an integer.  It would be possible to
    implement atomicNegate using this technique with atomicXor.  It is unclear
    whether this provides any actual utility.
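
    The atomicXor technique mentioned above can be sketched in C11 as a
    non-normative illustration.  The sketch assumes a 32-bit IEEE float
    stored in a cell that is accessed as an unsigned integer.  The names
    model_atomic_negate, float_to_bits, and bits_to_float are hypothetical
    and chosen only for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its 32-bit pattern (no value conversion). */
static uint32_t float_to_bits(float f)
{
    uint32_t u;
    assert(sizeof f == sizeof u);   /* the sketch assumes a 32-bit float */
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Reinterpret a 32-bit pattern as a float (no value conversion). */
static float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Hypothetical atomicNegate: flip the sign bit of the stored float with
 * an integer atomic XOR.  Returns the value held before the operation. */
static float model_atomic_negate(_Atomic uint32_t *cell)
{
    uint32_t old = atomic_fetch_xor(cell, UINT32_C(0x80000000));
    return bits_to_float(old);
}
```

    Because only the sign bit changes, the same operation also flips the
    sign of zeros, infinities, and NaNs without touching their payload.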

    3) What should be said about the NaN behavior?

    RESOLVED.  There are several aspects of NaN behavior that should be
    documented in this extension.  However, some of this behavior varies based
    on NaN concepts that do not exist in the GLSL specification.

    * atomicCompSwap performs the comparison as the floating-point equality
      operator (==).  That is, if either 'mem' or 'compare' is NaN, the
      comparison result is always false.

    * atomicMin and atomicMax implement the IEEE specification with respect to
      NaN.  IEEE considers two different kinds of NaN: signaling NaN and quiet
      NaN.  A quiet NaN has the most significant bit of the mantissa set, and
      a signaling NaN does not.  This concept does not exist in SPIR-V,
      Vulkan, or OpenGL.  Let qNaN denote a quiet NaN and sNaN denote a
      signaling NaN.  atomicMin and atomicMax specifically implement

      - fmin(qNaN, x) = fmin(x, qNaN) = fmax(qNaN, x) = fmax(x, qNaN) = x
      - fmin(sNaN, x) = fmin(x, sNaN) = fmax(sNaN, x) = fmax(x, sNaN) = sNaN
      - fmin(sNaN, qNaN) = fmin(qNaN, sNaN) = fmax(sNaN, qNaN) =
        fmax(qNaN, sNaN) = sNaN
      - fmin(sNaN, sNaN) = sNaN.  This specification does not define which of
        the two arguments is stored.
      - fmax(sNaN, sNaN) = sNaN.  This specification does not define which of
        the two arguments is stored.
      - fmin(qNaN, qNaN) = qNaN.  This specification does not define which of
        the two arguments is stored.
      - fmax(qNaN, qNaN) = qNaN.  This specification does not define which of
        the two arguments is stored.

    Further details are in the Skylake Programmer's Reference Manuals at
    https://01.org/linuxgraphics/documentation/hardware-specification-prms.
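
    The NaN rules listed for atomicMin and atomicMax can be condensed into
    a small C model (non-normative).  The sketch works on raw 32-bit
    patterns so that signaling NaNs are never altered by floating-point
    arithmetic.  All function names are hypothetical.  Where this
    specification leaves the stored argument undefined, the sketch makes
    an arbitrary choice that is marked in the comments, and it does not
    address signed zeros.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* NaN: exponent bits all ones and a non-zero mantissa. */
static bool is_nan_bits(uint32_t u)
{
    return (u & 0x7f800000u) == 0x7f800000u && (u & 0x007fffffu) != 0;
}

/* Signaling NaN: most-significant mantissa bit (bit 22) is clear. */
static bool is_snan_bits(uint32_t u)
{
    return is_nan_bits(u) && (u & 0x00400000u) == 0;
}

static float bits_to_float(uint32_t u)
{
    float f;
    assert(sizeof f == sizeof u);   /* the sketch assumes a 32-bit float */
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Returns the bit pattern that the model stores to mem.
 * want_min selects atomicMin (true) or atomicMax (false). */
static uint32_t model_minmax_bits(uint32_t mem, uint32_t data, bool want_min)
{
    /* A signaling NaN always wins.  If both are sNaN the choice is
     * undefined; this sketch arbitrarily keeps mem. */
    if (is_snan_bits(mem))
        return mem;
    if (is_snan_bits(data))
        return data;

    /* A quiet NaN always loses.  If both are qNaN the choice is
     * undefined; this sketch arbitrarily stores data. */
    if (is_nan_bits(mem))
        return data;
    if (is_nan_bits(data))
        return mem;

    /* Ordinary comparison; ties (including +0.0 vs. -0.0) keep mem. */
    float m = bits_to_float(mem);
    float d = bits_to_float(data);
    if (want_min)
        return d < m ? data : mem;
    return d > m ? data : mem;
}
```

    For example, with 0x7fc00000 as a quiet NaN and 0x3f800000 as 1.0, the
    model stores 1.0 for both the minimum and the maximum, matching the
    first rule in the list.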

    4) What about atomicMin and atomicMax with (+0.0, -0.0) or (-0.0, +0.0)
    arguments?

    RESOLVED.  atomicMin should store -0.0, and atomicMax should store +0.0.
    Due to a known issue in shipping Skylake GPUs, the incorrectly signed 0 is
    stored.  This behavior may change in later GPUs.
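
    The intended zero handling (not the behavior of shipping Skylake GPUs)
    can be sketched in C as a non-normative illustration.  The sketch
    assumes that neither operand is NaN, and the names
    model_min_zero_aware and model_max_zero_aware are hypothetical.

```c
#include <assert.h>
#include <math.h>

/* Value that atomicMin should store: -0.0 orders below +0.0. */
static float model_min_zero_aware(float mem, float data)
{
    assert(!isnan(mem) && !isnan(data));    /* NaN is out of scope here */

    if (mem == 0.0f && data == 0.0f)
        return signbit(mem) ? mem : data;   /* prefer the negative zero */
    return data < mem ? data : mem;
}

/* Value that atomicMax should store: +0.0 orders above -0.0. */
static float model_max_zero_aware(float mem, float data)
{
    assert(!isnan(mem) && !isnan(data));    /* NaN is out of scope here */

    if (mem == 0.0f && data == 0.0f)
        return signbit(mem) ? data : mem;   /* prefer the positive zero */
    return data > mem ? data : mem;
}
```

    The explicit zero check is needed because the ordinary comparison
    operators treat +0.0 and -0.0 as equal and so cannot pick between them.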

Revision History

    Rev  Date        Author    Changes
    ---  ----------  --------  ---------------------------------------------
      1  04/19/2018  idr       Initial version
      2  05/05/2018  idr       Describe interactions with the capabilities
                               added by SPV_INTEL_shader_atomic_float_minmax.
      3  05/29/2018  idr       Remove mention of 64-bit float support.
      4  06/22/2018  idr       Resolve issue #2.
                               Add issue #3 (regarding NaN behavior).
                               Add issue #4 (regarding atomicMin(-0, +0)).