-# TODO
+# Kazan
-intro. bit about hilariously not realising that spir-v was to be
-*compiled* (to LLVM-IR)...
+So after deciding to sponsor Jacob to work on a 3D Graphics Driver,
+for some reason I thought using Rust would be a good idea. Normally,
+3D Graphics Drivers are written in C or C++ for performance reasons;
+however, in this case I was attracted to the security and memory-safety
+inherent in Rust.
-# Jacob stuff
+Hilariously, it wasn't until some time last week that the way that Vulkan
+works actually sank in. I thought it was some sort of interpreter of
+a 3D API, just like Gallium3D: it most definitely is not. The core of
+Vulkan is [SPIR-V](https://en.wikipedia.org/wiki/SPIR-V), an Intermediate
+Representation (IR) language. Its predecessor, SPIR, was based on LLVM's IR
+and was developed for OpenCL Parallel Compute; somewhere along the line
+someone realised that the same kind of IR would also do well for
+representing shaders in 3D applications, and SPIR-V is the result.
-todo
+So whereas previously I was deeply concerned that I had made a huge mistake
+in using Rust, actually, the Rust driver isn't so much a "driver" as it
+is a **compiler**. As in: the purpose of a Vulkan implementation is to
+**compile** the 3D shader SPIR-V binary provided by the 3D application
+into something that will execute directly on the underlying hardware.
-# finishing up
+We chose to compile SPIR-V IR into LLVM IR, and for that task, the fact
+that the compiler is written in Rust does **not** affect the run-time
+performance of the shaders **in any way**. Once compiled to LLVM IR, the
+result will be handed to the standard LLVM JIT (Just-in-Time) low-level
+compiler, and it will execute **directly** as native machine code, **and**
+it will execute in parallel, as well.
-todo
+Contrast this with gallium3d-llvmpipe where the API is **interpreted**
+(and also single-threaded).
+# Example
+
+So I asked Jacob if he could do a quick write-up of an
+[example translation](https://salsa.debian.org/Kazan-team/kazan/blob/master/docs/Example%20Translation%20from%20SPIR-V%20to%20LLVM%20IR.md).
+I wanted to see what goes on, as I quite like compilers and language
+translators. Also, a couple weeks back he ran into some roadblocks on
+how the data structures would work in the compiler, so I figured it
+would be nice to do a visual worked example.
+
+It looks pretty straightforward. Start at C code, compile to SPIR-V
+(the writer of the 3D or OpenCL application does that part). The interesting
+bit is that SPIR-V kinda assumes a SIMD (or SIMT - which is basically
+"predicated SIMD") micro-architecture.
+
+Unlike in a standard sequential algorithm, branches are not done as
+"branches". Instead, a set of conditions is tested (in parallel),
+which produces a bit-field of 1s and 0s (representing success or
+failure of each of the parallel compares). Then the "THEN" part of the
+statement - bear in mind this is all parallel - is executed on each
+element where its corresponding "predicate" bit is set to "1", and the
+"ELSE" part of the statement is executed on each element where the bit is "0".
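+
+As a rough illustration of the scheme just described, here is a minimal
+scalar sketch in Rust. It is not taken from Kazan: the function names
+`compare_mask` and `predicated_select`, and the example condition
+`x < 0`, are made up for this post, and a plain loop stands in for what
+real hardware would do across all lanes at once.
+
+```rust
+/// Hypothetical helper: run the compare `x < 0` on every lane and pack
+/// the results into a bit-field (bit N is 1 if lane N passed the test).
+fn compare_mask(xs: &[i32]) -> u64 {
+    assert!(xs.len() <= 64, "this sketch only handles up to 64 lanes");
+    let mut mask = 0u64;
+    for (lane, &x) in xs.iter().enumerate() {
+        if x < 0 {
+            mask |= 1u64 << lane;
+        }
+    }
+    mask
+}
+
+/// Hypothetical helper: evaluate `if x < 0 { -x } else { x * 2 }`
+/// for every lane, driven purely by the predicate bit-field.
+fn predicated_select(xs: &[i32]) -> Vec<i32> {
+    let mask = compare_mask(xs);
+    let mut out = vec![0; xs.len()];
+    // "THEN" part: only touches lanes whose predicate bit is 1.
+    for (lane, &x) in xs.iter().enumerate() {
+        if (mask >> lane) & 1 == 1 {
+            out[lane] = -x;
+        }
+    }
+    // "ELSE" part: only touches lanes whose predicate bit is 0.
+    for (lane, &x) in xs.iter().enumerate() {
+        if (mask >> lane) & 1 == 0 {
+            out[lane] = x * 2;
+        }
+    }
+    out
+}
+
+fn main() {
+    let xs = [3, -1, 0, -7];
+    println!("mask = {:#06b}", compare_mask(&xs));
+    println!("out  = {:?}", predicated_select(&xs));
+}
+```
+
+Note that both the "THEN" and the "ELSE" passes walk over every lane:
+the predicate bit only decides whether a lane's result is written.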
+
+Predication is not very popular outside of the parallel world, because
+CPU cycles are "wasted" by having to send both the "THEN" *and* the "ELSE"
+statements through to the execution unit. Remember, though, that in
+the Libre-RISCV micro-architecture, as is described in the
+update on predication, a zero predication bit results in that element
+being **skipped**. Whilst it may be sent to the ALU, once the predicate
+bit is known, the operation is **cancelled** and the Function Units
+may be allocated alternate resources. So, unlike more traditional
+Vector and SIMT micro-architectures, our design does not suffer a performance
+penalty due to predication.
+
+We do have a couple of issues to contend with, in LLVM. Firstly: whilst
+this is a variable-length vectorisation micro-architecture, LLVM itself
+does not yet support variable-length data structures. It's all based around
+fixed SIMD. There is work underway to deal with that: we can adjust
+accordingly as it happens.
+Secondly: LLVM's IR support for predication is not as feature-rich as
+we would like: it's incomplete.
+
+However, we have to start somewhere, and, as this is mostly software, there
+is plenty of room to improve performance as time and resources allow.
+Interestingly, AMD are planning some improvements to LLVM that will help
+us out, here. The AMDGPU has similar polymorphic registers, so there are
+plans to add in support for register "types" that have the ability to
+span (use) more than one "hardware" register. It will be fascinating
+to watch that unfold.