[[!tag standards]]

# Simple-V Vectorisation for the OpenPOWER ISA

**SV is in DRAFT STATUS**. SV has not yet been submitted to the OpenPOWER Foundation ISA WG for review.

<https://bugs.libre-soc.org/show_bug.cgi?id=213>

Fundamental design principles:

* Simplicity of introduction and implementation on top of the existing OpenPOWER ISA
* Effectively a hardware for-loop: the PC is paused while multiple scalar operations are issued
* Preserves the underlying scalar execution dependencies, as if the for-loop had been expanded into actual scalar instructions
  (termed "preserving Program Order")
* Augments ("tags") existing instructions, providing Vectorisation "context", rather than adding new instructions
* Does not modify or deviate from the underlying scalar OpenPOWER ISA unless doing so provides a significant performance or other advantage in the Vector space (dropping XER.SO and OE=1, for example)
* Designed for Supercomputing: avoids creating significant sequential
  dependency hazards, allowing high-performance superscalar microarchitectures to be deployed

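
The "hardware for-loop" and "preserving Program Order" principles can be illustrated with a minimal simulator-style sketch. It is not taken from the SV specification: the names `scalar_add`, `sv_add` and `MASK64` are hypothetical, all three operands are assumed to be vectors, and predication, element-width overrides and scalar operands are left out.

```python
MASK64 = (1 << 64) - 1  # 64-bit GPRs assumed


def scalar_add(regs, rt, ra, rb):
    """One ordinary scalar add: regs[rt] = regs[ra] + regs[rb], modulo 2**64."""
    regs[rt] = (regs[ra] + regs[rb]) & MASK64


def sv_add(regs, rt, ra, rb, vl):
    """Hypothetical model of a Vectorised add: VL scalar adds, issued in order.

    Assumption: every operand is a vector, so element i uses register
    number base + i. Predication and element widths are not modelled.
    """
    for i in range(vl):  # the "hardware for-loop": the PC does not advance
        scalar_add(regs, rt + i, ra + i, rb + i)


# Overlapping source and destination: RT = RA + 1.
regs = [0] * 16
regs[0] = 1
regs[8:12] = [1, 1, 1, 1]
sv_add(regs, rt=1, ra=0, rb=8, vl=4)
print(regs[1:5])  # [2, 3, 4, 5]
```

Because element `i` reads the result written by element `i - 1`, the overlapping case behaves exactly like four dependent scalar adds executed one after another (a running sum). That is the behaviour the phrase "preserving Program Order" describes in this sketch.
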
Advantages of these design principles:

* A first (and sometimes only) implementation is easy to create as, literally, a for-loop in hardware, simulators, and compilers.
* More complex HDL can be built by replicating existing scalar ALUs and pipelines as blocks.
* As (mostly) a high-level "context" that does not (significantly) deviate from the scalar OpenPOWER ISA and that, in its purest form, is "a for-loop around scalar instructions", it is minimally disruptive and consequently stands a reasonable chance of broad community adoption and acceptance.
* Completely wipes opcode proliferation off the map: not just
  that of SIMD (SIMD is O(N^6) opcode proliferation)
  but that of Vectorisation ISAs as well.  No more separate Vector
  instructions.

Pages being developed and examples:

* [[sv/overview]] explaining the basics.
* [[sv/implementation]] implementation planning and coordination
* [[sv/svp64]] contains the packet-format *only*
* [[sv/setvl]] the Cray-style "Vector Length" instruction
* [[sv/predication]] discussion on predication concepts
* [[sv/cr_int_predication]] instructions needed for effective predication
* [[sv/masked_vector_chaining]]
* [[sv/discussion]]
* [[sv/example_dep_matrices]]
* [[sv/major_opcode_allocation]]
* [[opcode_regs_deduped]]
* [[sv/vector_swizzle]]
* [[sv/register_type_tags]]
* [[sv/mv.swizzle]]
* [[sv/mv.x]]
* [[sv/branches]] - SVP64 Conditional Branch behaviour: All/Some Vector CRs
* [[sv/cr_ops]] - SVP64 Condition Register ops: Guidelines
  on Vectorisation of any v3.0B base operations which return
  or modify a Condition Register bit or field.
* [[sv/fcvt]] FP Conversion (due to OpenPOWER Scalar FP32)
* [[sv/fclass]] detect class of FP numbers
* [[sv/int_fp_mv]] Move and convert GPR <-> FPR, needed for !VSX
* [[sv/mv.vec]] move to and from vec2/3/4
* [[sv/16_bit_compressed]] experimental
* [[sv/toc_data_pointer]] experimental
* [[sv/ldst]] Load and Store
* [[sv/sprs]] SPRs
* [[sv/bitmanip]]
* [[sv/remap]] "Remapping" for Matrix Multiply and RGB "Structure Packing"
* [[sv/propagation]] Context propagation including svp64, swizzle and remap
* [[sv/vector_ops]] Vector ops needed to make a "complete" Vector ISA
* [[sv/av_opcodes]] scalar opcodes for Audio/Video
* [[sv/byteswap]]
* TODO: OpenPOWER [[openpower/transcendentals]]

Additional links:

* <https://www.sigarch.org/simd-instructions-considered-harmful/>
* [[simple_v_extension]] old (deprecated) version
* [[openpower/sv/llvm]]

Obligatory Dilbert:

<img src="https://assets.amuniversal.com/7fada35026ca01393d3d005056a9545d" width="600" />