docs/specs/MESA_shader_integer_functions.txt

   1 Name
   2
   3     MESA_shader_integer_functions
   4
   5 Name Strings
   6
   7     GL_MESA_shader_integer_functions
   8
   9 Contact
  10
  11     Ian Romanick <ian.d.romanick@intel.com>
  12
  13 Contributors
  14
  15     All the contributors of GL_ARB_gpu_shader5
  16
  17 Status
  18
  19     Supported by all GLSL 1.30 capable drivers in Mesa 12.1 and later
  20
  21 Version
  22
  23     Version 3, March 31, 2017
  24
  25 Number
  26
  27     OpenGL Extension #495
  28
  29 Dependencies
  30
  31     This extension is written against the OpenGL 3.2 (Compatibility Profile)
  32     Specification.
  33
  34     This extension is written against Version 1.50 (Revision 09) of the OpenGL
  35     Shading Language Specification.
  36
  37     GLSL 1.30 (OpenGL) or GLSL ES 3.00 (OpenGL ES) is required.
  38
  39     This extension interacts with ARB_gpu_shader5.
  40
  41     This extension interacts with ARB_gpu_shader_fp64.
  42
  43     This extension interacts with NV_gpu_shader5.
  44
  45 Overview
  46
  47     GL_ARB_gpu_shader5 extends GLSL in a number of useful ways.  Much of this
  48     added functionality requires significant hardware support.  There are many
  49     aspects, however, that can be easily implemented on any GPU with "real"
  50     integer support (as opposed to simulating integers using floating point
  51     calculations).
  52
  53     This extension provides a set of new features to the OpenGL Shading
  54     Language to support capabilities of these GPUs, extending the
  55     capabilities of version 1.30 of the OpenGL Shading Language and version
  56     3.00 of the OpenGL ES Shading Language.  Shaders using the new
  57     functionality provided by this extension should enable this
  58     functionality via the construct
  59
  60       #extension GL_MESA_shader_integer_functions : require   (or enable)
  61
  62     This extension provides a variety of new features for all shader types,
  63     including:
  64
  65       * support for implicitly converting signed integer types to unsigned
  66         types, as well as more general implicit conversion and function
  67         overloading infrastructure to support new data types introduced by
  68         other extensions;
  69
  70       * new built-in functions supporting:
  71
  72         * splitting a floating-point number into a significand and exponent
  73           (frexp), or building a floating-point number from a significand and
  74           exponent (ldexp);
  75
  76         * integer bitfield manipulation, including functions to find the
  77           position of the most or least significant set bit, count the number
  78           of one bits, and bitfield insertion, extraction, and reversal;
  79
  80         * extended integer precision math, including add with carry, subtract
  81           with borrow, and extended multiplication;
  82
  83     The resulting extension is a strict subset of GL_ARB_gpu_shader5.
  84
  85 IP Status
  86
  87     No known IP claims.
  88
  89 New Procedures and Functions
  90
  91     None
  92
  93 New Tokens
  94
  95     None
  96
  97 Additions to Chapter 2 of the OpenGL 3.2 (Compatibility Profile) Specification
  98 (OpenGL Operation)
  99
 100     None.
 101
 102 Additions to Chapter 3 of the OpenGL 3.2 (Compatibility Profile) Specification
 103 (Rasterization)
 104
 105     None.
 106
 107 Additions to Chapter 4 of the OpenGL 3.2 (Compatibility Profile) Specification
 108 (Per-Fragment Operations and the Frame Buffer)
 109
 110     None.
 111
 112 Additions to Chapter 5 of the OpenGL 3.2 (Compatibility Profile) Specification
 113 (Special Functions)
 114
 115     None.
 116
 117 Additions to Chapter 6 of the OpenGL 3.2 (Compatibility Profile) Specification
 118 (State and State Requests)
 119
 120     None.
 121
 122 Additions to Appendix A of the OpenGL 3.2 (Compatibility Profile)
 123 Specification (Invariance)
 124
 125     None.
 126
 127 Additions to the AGL/GLX/WGL Specifications
 128
 129     None.
 130
 131 Modifications to The OpenGL Shading Language Specification, Version 1.50
 132 (Revision 09)
 133
 134     Including the following line in a shader can be used to control the
 135     language features described in this extension:
 136
 137       #extension GL_MESA_shader_integer_functions : <behavior>
 138
 139     where <behavior> is as specified in section 3.3.
 140
 141     New preprocessor #defines are added to the OpenGL Shading Language:
 142
 143       #define GL_MESA_shader_integer_functions        1
 144
 145
 146     Modify Section 4.1.10, Implicit Conversions, p. 27
 147
 148     (modify table of implicit conversions)
 149
 150                                 Can be implicitly
 151         Type of expression        converted to
 152         ---------------------   -----------------
 153         int                     uint, float
 154         ivec2                   uvec2, vec2
 155         ivec3                   uvec3, vec3
 156         ivec4                   uvec4, vec4
 157
 158         uint                    float
 159         uvec2                   vec2
 160         uvec3                   vec3
 161         uvec4                   vec4
 162
 163     (modify second paragraph of the section) No implicit conversions are
 164     provided to convert from unsigned to signed integer types or from
 165     floating-point to integer types.  There are no implicit array or structure
 166     conversions.
 167
 168     (insert before the final paragraph of the section) When performing
 169     implicit conversion for binary operators, there may be multiple data types
 170     to which the two operands can be converted.  For example, when adding an
 171     int value to a uint value, both values can be implicitly converted to uint
 172     and float.  In such cases, a floating-point type is chosen if either
 173     operand has a floating-point type.  Otherwise, an unsigned integer type is
 174     chosen if either operand has an unsigned integer type.  Otherwise, a
 175     signed integer type is chosen.
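
         As a rough illustration of the selection rule above, the following C
         sketch models the choice of common operand type for a binary
         operator.  The enum and function names are hypothetical; they are
         not part of GLSL or of this extension.

```c
/* Hypothetical model of the three scalar base types involved. */
typedef enum { BASE_INT, BASE_UINT, BASE_FLOAT } base_type;

/* Pick the type both operands are converted to: a floating-point type
 * wins over an unsigned type, and an unsigned type wins over a signed
 * type. */
base_type binary_operand_type(base_type a, base_type b)
{
    if (a == BASE_FLOAT || b == BASE_FLOAT)
        return BASE_FLOAT;
    if (a == BASE_UINT || b == BASE_UINT)
        return BASE_UINT;
    return BASE_INT;
}
```

         Under this model, adding an int value to a uint value selects uint,
         and adding either of them to a float value selects float.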
 176
 177
 178     Modify Section 5.9, Expressions, p. 57
 179
 180     (modify bulleted list as follows, adding support for implicit conversion
 181     between signed and unsigned types)
 182
 183     Expressions in the shading language are built from the following:
 184
 185     * Constants of type bool, int, int64_t, uint, uint64_t, float, all vector
 186       types, and all matrix types.
 187
 188     ...
 189
 190     * The operator modulus (%) operates on signed or unsigned integer scalars
 191       or vectors.  If the fundamental types of the operands do not match, the
 192       conversions from Section 4.1.10 "Implicit Conversions" are applied to
 193       produce matching types.  ...
 194
 195
 196     Modify Section 6.1, Function Definitions, p. 63
 197
 198     (modify description of overloading, beginning at the top of p. 64)
 199
 200      Function names can be overloaded.  The same function name can be used for
 201      multiple functions, as long as the parameter types differ.  If a function
 202      name is declared twice with the same parameter types, then the return
 203      types and all qualifiers must also match, and it is the same function
 204      being declared.  For example,
 205
 206        vec4 f(in vec4 x, out vec4  y);   // (A)
 207        vec4 f(in vec4 x, out uvec4 y);   // (B) okay, different argument type
 208        vec4 f(in ivec4 x, out uvec4 y);  // (C) okay, different argument type
 209
 210        int  f(in vec4 x, out ivec4 y);  // error, only return type differs
 211        vec4 f(in vec4 x, in  vec4  y);  // error, only qualifier differs
 212        vec4 f(const in vec4 x, out vec4 y);  // error, only qualifier differs
 213
 214      When function calls are resolved, an exact type match for all the
 215      arguments is sought.  If an exact match is found, all other functions are
 216      ignored, and the exact match is used.  If no exact match is found, then
 217      the implicit conversions in Section 4.1.10 (Implicit Conversions) will be
 218      applied to find a match.  Mismatched types on input parameters (in or
 219      inout or default) must have a conversion from the calling argument type
 220      to the formal parameter type.  Mismatched types on output parameters (out
 221      or inout) must have a conversion from the formal parameter type to the
 222      calling argument type.
 223
 224      If implicit conversions can be used to find more than one matching
 225      function, a single best-matching function is sought.  To determine a best
 226      match, the conversions between calling argument and formal parameter
 227      types are compared for each function argument and pair of matching
 228      functions.  After these comparisons are performed, each pair of matching
 229      functions is compared.  A function definition A is considered a better
 230      match than function definition B if:
 231
 232        * for at least one function argument, the conversion for that argument
 233          in A is better than the corresponding conversion in B; and
 234
 235        * there is no function argument for which the conversion in B is better
 236          than the corresponding conversion in A.
 237
 238      If a single function definition is considered a better match than every
 239      other matching function definition, it will be used.  Otherwise, a
 240      semantic error occurs and the shader will fail to compile.
 241
 242      To determine whether the conversion for a single argument in one match is
 243      better than that for another match, the following rules are applied, in
 244      order:
 245
 246        1. An exact match is better than a match involving any implicit
 247           conversion.
 248
 249        2. A match involving an implicit conversion from float to double is
 250           better than a match involving any other implicit conversion.
 251
 252        3. A match involving an implicit conversion from either int or uint to
 253           float is better than a match involving an implicit conversion from
 254           either int or uint to double.
 255
 256      If none of the rules above apply to a particular pair of conversions,
 257      neither conversion is considered better than the other.
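
          The pairwise comparison described above can be sketched in C as
          follows.  This is only a model under stated assumptions: each
          candidate is reduced to one rank per argument (0 for an exact
          match, 1 for an implicit conversion), which covers rule 1 only,
          since rules 2 and 3 involve "double".  The function names and the
          rank encoding are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if candidate a is a better match than candidate b.
 * a and b each hold one rank per argument; a lower rank is better. */
bool is_better_match(const int *a, const int *b, size_t nargs)
{
    bool better_somewhere = false;
    for (size_t i = 0; i < nargs; i++) {
        if (b[i] < a[i])
            return false;          /* b is better on this argument */
        if (a[i] < b[i])
            better_somewhere = true;
    }
    return better_somewhere;
}

/* ranks is an ncand x nargs row-major table.  Returns the index of the
 * single candidate that is better than every other one, or -1 if there
 * is none (the call would then fail to compile). */
int pick_best_match(const int *ranks, size_t ncand, size_t nargs)
{
    for (size_t i = 0; i < ncand; i++) {
        bool best = true;
        for (size_t j = 0; j < ncand && best; j++) {
            if (j != i && !is_better_match(&ranks[i * nargs],
                                           &ranks[j * nargs], nargs))
                best = false;
        }
        if (best)
            return (int)i;
    }
    return -1;
}
```

          With rank rows {1,0}, {1,1}, and {0,1} no candidate beats both of
          the others, so the model reports no best match.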
 258
 259      For the function prototypes (A), (B), and (C) above, the following
 260      examples show how the rules apply to different sets of calling argument
 261      types:
 262
 263        f(vec4, vec4);        // exact match of vec4 f(in vec4 x, out vec4 y)
 264        f(vec4, uvec4);       // exact match of vec4 f(in vec4 x, out uvec4 y)
 265        f(vec4, ivec4);       // matched to vec4 f(in vec4 x, out vec4 y)
 266                              //   (C) not relevant, can't convert vec4 to
 267                              //   ivec4.  (A) better than (B) for 2nd
 268                              //   argument (rule 2), same on first argument.
 269        f(ivec4, vec4);       // NOT matched.  All three match by implicit
 270                              //   conversion.  (C) is better than (A) and (B)
 271                              //   on the first argument.  (A) is better than
 272                              //   (B) and (C) on the second argument.
 273
 274
 275     Modify Section 8.3, Common Functions, p. 84
 276
 277     (add support for single-precision frexp and ldexp functions)
 278
 279     Syntax:
 280
 281       genType frexp(genType x, out genIType exp);
 282       genType ldexp(genType x, in genIType exp);
 283
 284     The function frexp() splits each single-precision floating-point number in
 285     <x> into a binary significand, a floating-point number in the range [0.5,
 286     1.0), and an integral exponent of two, such that:
 287
 288       x = significand * 2 ^ exponent
 289
 290     The significand is returned by the function; the exponent is returned in
 291     the parameter <exp>.  For a floating-point value of zero, the significand
 292     and exponent are both zero.  For a floating-point value that is an
 293     infinity or is not a number, the results of frexp() are undefined.
 294
 295     If the input <x> is a vector, this operation is performed in a
 296     component-wise manner; the value returned by the function and the value
 297     written to <exp> are vectors with the same number of components as <x>.
 298
 299     The function ldexp() builds a single-precision floating-point number from
 300     each significand component in <x> and the corresponding integral exponent
 301     of two in <exp>, returning:
 302
 303       significand * 2 ^ exponent
 304
 305     If this product is too large to be represented as a single-precision
 306     floating-point value, the result is considered undefined.
 307
 308     If the input <x> is a vector, this operation is performed in a
 309     component-wise manner; the value passed in <exp> and returned by the
 310     function are vectors with the same number of components as <x>.
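
         For finite scalar inputs, the C library functions frexpf() and
         ldexpf() follow the same contract, so they can serve as a simple
         model of the behavior described above.  The wrapper names below are
         hypothetical, and the sketch covers a single component only.

```c
#include <math.h>

/* Scalar model of frexp(): returns a significand in [0.5, 1.0) (or zero)
 * and writes the power-of-two exponent to *exp. */
float model_frexp(float x, int *exp)
{
    return frexpf(x, exp);
}

/* Scalar model of ldexp(): returns significand * 2^exp. */
float model_ldexp(float significand, int exp)
{
    return ldexpf(significand, exp);
}
```

         For example, 8.0 splits into a significand of 0.5 and an exponent of
         4, and passing those two values to the ldexp model gives 8.0 again.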
 311
 312
 313     (add support for new integer built-in functions)
 314
 315     Syntax:
 316
 317       genIType bitfieldExtract(genIType value, int offset, int bits);
 318       genUType bitfieldExtract(genUType value, int offset, int bits);
 319
 320       genIType bitfieldInsert(genIType base, genIType insert, int offset,
 321                               int bits);
 322       genUType bitfieldInsert(genUType base, genUType insert, int offset,
 323                               int bits);
 324
 325       genIType bitfieldReverse(genIType value);
 326       genUType bitfieldReverse(genUType value);
 327
 328       genIType bitCount(genIType value);
 329       genIType bitCount(genUType value);
 330
 331       genIType findLSB(genIType value);
 332       genIType findLSB(genUType value);
 333
 334       genIType findMSB(genIType value);
 335       genIType findMSB(genUType value);
 336
 337     The function bitfieldExtract() extracts bits <offset> through
 338     <offset>+<bits>-1 from each component in <value>, returning them in the
 339     least significant bits of the corresponding component of the result.  For
 340     unsigned data types, the most significant bits of the result will be set
 341     to zero.  For signed data types, the most significant bits will be set to
 342     the value of bit <offset>+<bits>-1.  If <bits> is zero, the result will be
 343     zero.  The result will be undefined if <offset> or <bits> is negative, or
 344     if the sum of <offset> and <bits> is greater than the number of bits used
 345     to store the operand.  Note that for vector versions of bitfieldExtract(),
 346     a single pair of <offset> and <bits> values is shared for all components.
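
         A minimal C sketch of the scalar behavior described above, assuming
         32-bit operands.  The function names are hypothetical.  Inputs for
         which the result is undefined (negative <offset> or <bits>, or
         <offset>+<bits> greater than 32) are not checked.

```c
#include <stdint.h>

/* Unsigned variant: the upper bits of the result are zero. */
uint32_t bitfield_extract_u(uint32_t value, int offset, int bits)
{
    if (bits == 0)
        return 0;
    if (bits == 32)                 /* only defined with offset == 0 */
        return value;
    return (value >> offset) & ((UINT32_C(1) << bits) - 1);
}

/* Signed variant: the upper bits replicate the top bit of the field. */
int32_t bitfield_extract_i(int32_t value, int offset, int bits)
{
    uint32_t field = bitfield_extract_u((uint32_t)value, offset, bits);
    if (bits > 0 && bits < 32 && ((field >> (bits - 1)) & 1u))
        field |= UINT32_MAX << bits;    /* sign-extend the field */
    /* Assumes the usual two's-complement conversion to int32_t. */
    return (int32_t)field;
}
```

         Extracting 4 bits at offset 8 from 0x00000F00 yields 15 with the
         unsigned variant and -1 with the signed variant.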
 347
 348     The function bitfieldInsert() inserts the <bits> least significant bits of
 349     each component of <insert> into the corresponding component of <base>.
 350     The result will have bits numbered <offset> through <offset>+<bits>-1
 351     taken from bits 0 through <bits>-1 of <insert>, and all other bits taken
 352     directly from the corresponding bits of <base>.  If <bits> is zero, the
 353     result will simply be <base>.  The result will be undefined if <offset> or
 354     <bits> is negative, or if the sum of <offset> and <bits> is greater than
 355     the number of bits used to store the operand.  Note that for vector
 356     versions of bitfieldInsert(), a single pair of <offset> and <bits> values
 357     is shared for all components.
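
         The insertion rule above can be sketched in C for a single 32-bit
         component as follows.  The function name is hypothetical, and
         inputs with undefined results are not checked.

```c
#include <stdint.h>

/* Replace bits offset..offset+bits-1 of base with the low bits of insert. */
uint32_t bitfield_insert_u(uint32_t base, uint32_t insert, int offset,
                           int bits)
{
    if (bits == 0)
        return base;
    /* bits == 32 is only defined with offset == 0. */
    uint32_t mask = (bits == 32)
                  ? UINT32_MAX
                  : ((UINT32_C(1) << bits) - 1) << offset;
    return (base & ~mask) | ((insert << offset) & mask);
}
```

         Inserting 8 zero bits at offset 8 into 0xFFFFFFFF gives 0xFFFF00FF.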
 358
 359     The function bitfieldReverse() reverses the bits of <value>.  The bit
 360     numbered <n> of the result will be taken from bit (<bits>-1)-<n> of
 361     <value>, where <bits> is the total number of bits used to represent
 362     <value>.
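
         A straightforward, unoptimized C sketch of the reversal for one
         32-bit component.  The function name is hypothetical.

```c
#include <stdint.h>

/* Bit n of the result is taken from bit 31-n of value. */
uint32_t bitfield_reverse_u(uint32_t value)
{
    uint32_t result = 0;
    for (int n = 0; n < 32; n++) {
        if (value & (UINT32_C(1) << (31 - n)))
            result |= UINT32_C(1) << n;
    }
    return result;
}
```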
 363
 364     The function bitCount() returns the number of one bits in the binary
 365     representation of <value>.
 366
 367     The function findLSB() returns the bit number of the least significant one
 368     bit in the binary representation of <value>.  If <value> is zero, -1 will
 369     be returned.
 370
 371     The function findMSB() returns the bit number of the most significant bit
 372     in the binary representation of <value>.  For positive integers, the
 373     result will be the bit number of the most significant one bit.  For
 374     negative integers, the result will be the bit number of the most
 375     significant zero bit.  For a <value> of zero or negative one, -1 will be
 376     returned.
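
         The three functions above can be modeled in C for a single 32-bit
         component as shown below.  The function names are hypothetical, and
         the loops favor clarity over speed.

```c
#include <stdint.h>

/* Number of one bits in value. */
int bit_count(uint32_t value)
{
    int count = 0;
    while (value) {
        value &= value - 1;     /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Bit number of the least significant one bit, or -1 for zero. */
int find_lsb(uint32_t value)
{
    if (value == 0)
        return -1;
    int n = 0;
    while (!(value & 1u)) {
        value >>= 1;
        n++;
    }
    return n;
}

/* Unsigned findMSB: most significant one bit, or -1 for zero. */
int find_msb_u(uint32_t value)
{
    int n = -1;
    while (value) {
        value >>= 1;
        n++;
    }
    return n;
}

/* Signed findMSB: for negative inputs, look for the most significant
 * zero bit by inverting first.  Zero and -1 both give -1. */
int find_msb_i(int32_t value)
{
    uint32_t v = (uint32_t)value;
    if (value < 0)
        v = ~v;
    return find_msb_u(v);
}
```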
 377
 378
 379     (support for unsigned integer add/subtract with carry-out)
 380
 381     Syntax:
 382
 383       genUType uaddCarry(genUType x, genUType y, out genUType carry);
 384       genUType usubBorrow(genUType x, genUType y, out genUType borrow);
 385
 386     The function uaddCarry() adds 32-bit unsigned integers or vectors <x> and
 387     <y>, returning the sum modulo 2^32.  The value <carry> is set to zero if
 388     the sum was less than 2^32, or one otherwise.
 389
 390     The function usubBorrow() subtracts the 32-bit unsigned integer or vector
 391     <y> from <x>, returning the difference if non-negative or 2^32 plus the
 392     difference, otherwise.  The value <borrow> is set to zero if x >= y, or
 393     one otherwise.
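
         A minimal C model of the scalar forms, using pointers in place of
         the GLSL "out" parameters.  The function names are hypothetical.
         Unsigned arithmetic in C already wraps modulo 2^32 for uint32_t, so
         only the carry and borrow need to be computed explicitly.

```c
#include <stdint.h>

/* Returns (x + y) mod 2^32; *carry is 1 if the true sum needed 33 bits. */
uint32_t uadd_carry(uint32_t x, uint32_t y, uint32_t *carry)
{
    uint32_t sum = x + y;
    *carry = (sum < x) ? 1u : 0u;
    return sum;
}

/* Returns (x - y) mod 2^32; *borrow is 1 if x < y. */
uint32_t usub_borrow(uint32_t x, uint32_t y, uint32_t *borrow)
{
    *borrow = (x < y) ? 1u : 0u;
    return x - y;
}
```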
 394
 395
 396     (support for signed and unsigned multiplies, with 32-bit inputs and a
 397      64-bit result spanning two 32-bit outputs)
 398
 399     Syntax:
 400
 401       void umulExtended(genUType x, genUType y, out genUType msb,
 402                         out genUType lsb);
 403       void imulExtended(genIType x, genIType y, out genIType msb,
 404                         out genIType lsb);
 405
 406     The functions umulExtended() and imulExtended() multiply 32-bit unsigned
 407     or signed integers or vectors <x> and <y>, producing a 64-bit result.  The
 408     32 least significant bits are returned in <lsb>; the 32 most significant
 409     bits are returned in <msb>.
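
         On a host with 64-bit integers, the scalar behavior can be modeled
         in C by widening the operands before multiplying.  The function
         names are hypothetical, and pointers stand in for the GLSL "out"
         parameters.

```c
#include <stdint.h>

void umul_extended(uint32_t x, uint32_t y, uint32_t *msb, uint32_t *lsb)
{
    uint64_t product = (uint64_t)x * (uint64_t)y;
    *msb = (uint32_t)(product >> 32);
    *lsb = (uint32_t)product;
}

void imul_extended(int32_t x, int32_t y, int32_t *msb, int32_t *lsb)
{
    int64_t product = (int64_t)x * (int64_t)y;
    uint64_t bits = (uint64_t)product;
    /* Assumes the usual two's-complement conversion to int32_t. */
    *msb = (int32_t)(uint32_t)(bits >> 32);
    *lsb = (int32_t)(uint32_t)bits;
}
```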
 410
 411
 412 GLX Protocol
 413
 414     None.
 415
 416 Dependencies on ARB_gpu_shader_fp64
 417
 418     This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set
 419     of implicit conversions supported in the OpenGL Shading Language.  If more
 420     than one of these extensions is supported, an expression of one type may
 421     be converted to another type if that conversion is allowed by any of these
 422     specifications.
 423
 424     If ARB_gpu_shader_fp64 or a similar extension introducing new data types
 425     is not supported, the function overloading rule in the GLSL specification
 426     preferring promotion of an input parameter from a smaller type to a larger
 427     type is never applicable, as all data types are of the same size.  That rule
 428     and the example referring to "double" should be removed.
 429
 430
 431 Dependencies on NV_gpu_shader5
 432
 433     This extension, ARB_gpu_shader_fp64, and NV_gpu_shader5 all modify the set
 434     of implicit conversions supported in the OpenGL Shading Language.  If more
 435     than one of these extensions is supported, an expression of one type may
 436     be converted to another type if that conversion is allowed by any of these
 437     specifications.
 438
 439     If NV_gpu_shader5 is supported, integer data types are supported with four
 440     different precisions (8-, 16-, 32-, and 64-bit) and floating-point data
 441     types are supported with three different precisions (16-, 32-, and
 442     64-bit).  The extension adds the following rule for output parameters,
 443     which is similar to the one present in this extension for input
 444     parameters:
 445
 446        5. If the formal parameters in both matches are output parameters, a
 447           conversion from a type with a larger number of bits per component is
 448           better than a conversion from a type with a smaller number of bits
 449           per component.  For example, a conversion from an "int16_t" formal
 450           parameter type to "int" is better than one from an "int8_t" formal
 451           parameter type to "int".
 452
 453     Such a rule is not provided in this extension because there is no
 454     combination of types in this extension and ARB_gpu_shader_fp64 where this
 455     rule has any effect.
 456
 457
 458 Errors
 459
 460     None
 461
 462
 463 New State
 464
 465     None
 466
 467 New Implementation Dependent State
 468
 469     None
 470
 471 Issues
 472
 473     (1) What should this extension be called?
 474
 475       UNRESOLVED.  This extension borrows from GL_ARB_gpu_shader5, so creating
 476       some sort of a play on that name would be viable.  However, nothing in
 477       this extension should require SM5 hardware, so such a name would be a
 478       little misleading and weird.
 479
 480       Since the primary purpose is to add integer-related functions from
 481       GL_ARB_gpu_shader5, call this extension GL_MESA_shader_integer_functions
 482       for now.
 483
 484     (2) Why is some of the formatting in this extension weird?
 485
 486       RESOLVED: This extension is formatted to minimize the differences (as
 487       reported by 'diff --side-by-side -W180') with the GL_ARB_gpu_shader5
 488       specification.
 489
 490     (3) Should ldexp and frexp be included?
 491
 492       RESOLVED: Yes.  Few GPUs have native instructions to implement these
 493       functions.  These are generally implemented using existing GLSL built-in
 494       functions and the other functions provided by this extension.
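
           As one possible illustration of such a lowering (not necessarily
           what any driver does), the C sketch below builds frexp from
           integer operations on the IEEE 754 single-precision bit pattern.
           memcpy() stands in for a bit-reinterpretation built-in such as
           floatBitsToUint().  The function name is hypothetical, and
           denormal inputs are treated as zero, which is an assumption of
           this sketch.

```c
#include <stdint.h>
#include <string.h>

float frexp_bits(float x, int *exp)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);           /* reinterpret float as uint */

    uint32_t biased = (bits >> 23) & 0xFFu;   /* 8-bit exponent field */
    if (biased == 0) {
        /* Zero; denormals are flushed to zero here (assumption). */
        *exp = 0;
        return 0.0f;
    }

    /* x = 1.m * 2^(biased-127) = 0.1m * 2^(biased-126) */
    *exp = (int)biased - 126;

    /* Keep sign and mantissa, force the exponent field to 126 so the
     * magnitude lands in [0.5, 1.0). */
    bits = (bits & 0x807FFFFFu) | (UINT32_C(126) << 23);

    float significand;
    memcpy(&significand, &bits, sizeof significand);
    return significand;
}
```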
 495
 496     (4) Should umulExtended and imulExtended be included?
 497
 498       RESOLVED: Yes.  These functions should be implementable on any GPU that
 499       can support the rest of this extension, but the implementation may be
 500       complex.  The implementation on a GPU that only supports 32bit x 32bit =
 501       32bit multiplication would be quite expensive.  However, many GPUs
 502       (including OpenGL 4.0 GPUs that already support this function) have a
 503       32bit x 16bit = 48bit multiplier.  The implementation there is only
 504       trivially more expensive than regular 32bit multiplication.
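
           To give a sense of the cost on hardware limited to 32bit x 32bit =
           32bit multiplication, the C sketch below computes the unsigned
           64-bit product from four 16-bit partial products.  It is a
           schoolbook decomposition written for illustration, with a
           hypothetical function name; it is not taken from any particular
           driver.

```c
#include <stdint.h>

void umul_extended_narrow(uint32_t x, uint32_t y,
                          uint32_t *msb, uint32_t *lsb)
{
    uint32_t x_lo = x & 0xFFFFu, x_hi = x >> 16;
    uint32_t y_lo = y & 0xFFFFu, y_hi = y >> 16;

    /* Each partial product of two 16-bit values fits in 32 bits. */
    uint32_t ll = x_lo * y_lo;
    uint32_t lh = x_lo * y_hi;
    uint32_t hl = x_hi * y_lo;
    uint32_t hh = x_hi * y_hi;

    /* Sum of the terms that land in bits 16..31, plus their carries.
     * Three values of at most 0xFFFF cannot overflow 32 bits. */
    uint32_t mid = (ll >> 16) + (lh & 0xFFFFu) + (hl & 0xFFFFu);

    *lsb = (ll & 0xFFFFu) | (mid << 16);
    *msb = hh + (lh >> 16) + (hl >> 16) + (mid >> 16);
}
```

           This takes four multiplies plus several shifts, masks, and adds
           for each component, compared with a single multiply for an
           ordinary 32-bit product.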
 505
 506     (5) Should the pack and unpack functions be included?
 507
 508       RESOLVED: No.  These functions are already available via
 509       GL_ARB_shading_language_packing.
 510
 511     (6) Should the "BitsTo" functions be included?
 512
 513       RESOLVED: No.  These functions are already available via
 514       GL_ARB_shader_bit_encoding.
 515
 516 Revision History
 517
 518     Rev.      Date     Author    Changes
 519     ----  -----------  --------  -----------------------------------------
 520      3    31-Mar-2017  Jon Leech Add ES support (OpenGL-Registry/issues/3)
 521      2     7-Jul-2016  idr       Fix typo in #extension line
 522      1    20-Jun-2016  idr       Initial version based on GL_ARB_gpu_shader5.