# Kazan

So after deciding to sponsor Jacob to work on a 3D Graphics Driver, for some reason I thought using Rust would be a good idea. Normally, 3D Graphics Drivers are written in C or C++ for performance reasons, however in this case I was attracted to the security and memory-safety inherent in Rust.

Hilariously, it wasn't until some time last week that the way that Vulkan works actually sank in. I thought it was some sort of interpreter of a 3D API, just like gallium3d: it most definitely is not.

The core of Vulkan is [SPIR-V](https://en.wikipedia.org/wiki/SPIR-V), an Intermediate Representation (IR) language whose predecessor, SPIR, was based on LLVM's IR. SPIR was originally developed for OpenCL Parallel Compute; somewhere along the line someone realised that this kind of IR would also do well for representing shaders in 3D applications.

So whereas previously I was deeply concerned that I had made a huge mistake in using Rust, actually, the Rust driver isn't so much a "driver" as it is a **compiler**. As in: the purpose of a Vulkan implementation is to **compile** the 3D shader SPIR-V binary provided by the 3D application into something that will execute directly on the underlying hardware.

We chose to compile SPIR-V IR into LLVM IR, and for that task, the fact that the compiler is written in Rust does **not** affect the run-time performance of the shaders **in any way**. Once compiled to LLVM, the resultant IR will be handed to the standard LLVM JIT (Just-in-Time) low-level compiler, and it will execute **directly** as native machine code, **and** it will execute in parallel, as well. Contrast this with gallium3d-llvmpipe where the API is **interpreted** (and also single-threaded).

# Example

So I asked Jacob if he could do a quick write-up of an [example translation](https://salsa.debian.org/Kazan-team/kazan/blob/master/docs/Example%20Translation%20from%20SPIR-V%20to%20LLVM%20IR.md). I wanted to see what goes on, as I quite like compilers and language translators.
Also, a couple weeks back he ran into some roadblocks on how the data structures would work in the compiler, so I figured it would be nice to do a visual worked example.

It looks pretty straightforward. Start with C code, compile to SPIR-V (the writer of the 3D or OpenCL application does that part).

The interesting bit is that SPIR-V kinda assumes a SIMD (or SIMT - which is basically "predicated SIMD") micro-architecture. Unlike in a standard sequential algorithm, branches are not done as "branches". Instead, a set of conditions is tested (in parallel), which produces a bit-field of 1s and 0s (representing success or failure of each of the parallel compares). Then the "THEN" part of the statement - bear in mind this is all parallel - is executed on each element whose corresponding "predicate" bit is set to "1", and the "ELSE" part of the statement is executed on each element whose bit is "0".

Predication is not very popular outside of the parallel world, because CPU cycles are "wasted" by having to send both the "THEN" *and* the "ELSE" statements through to the execution unit.

Remember, though, that in the Libre-RISCV micro-architecture, as is described in the update on predication, a zero predication bit results in that element being **skipped**. Whilst it may be sent to the ALU, once the predicate bit is known, the operation is **cancelled** and the Function Units may be allocated alternate resources. So, unlike more traditional Vector and SIMT micro-architectures, our design does not suffer a performance penalty due to predication.

We do have a couple of issues to contend with, in LLVM. Firstly: whilst this is a variable-length vectorisation micro-architecture, LLVM itself does not yet support variable-length data structures. It's all based around fixed SIMD. There is work underway to deal with that: we can adjust accordingly as it happens. Secondly: LLVM's IR support for predication is not as feature rich as we would like: it's incomplete.
However, we have to start somewhere, and, as this is mostly software, there is plenty of room to improve performance as time and resources allow.

Interestingly, AMD are planning some improvements to LLVM that will help us out here. The AMDGPU has similar polymorphic registers, so there are plans to add in support for register "types" that have the ability to span (use) more than one "hardware" register. It will be fascinating to watch that unfold.
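
To make the predicated "THEN"/"ELSE" idea from earlier a bit more concrete, here is a minimal sketch in Rust. It is not taken from Kazan, from the SPIR-V specification or from LLVM: the names `LANES`, `compare_gt`, `predicated_apply` and `predicated_if_else` are all made up for illustration, the lane count is fixed at 4, and the "skipping" of an element is simply modelled by an `if` inside a sequential loop, where real hardware would do the lanes in parallel.

```rust
/// Number of parallel lanes in this sketch (an assumption; real hardware varies).
const LANES: usize = 4;

/// Test every lane against `threshold` and pack the results into a bit-field:
/// bit i is 1 if the compare succeeded for lane i, and 0 if it failed.
fn compare_gt(values: &[i32; LANES], threshold: i32) -> u8 {
    let mut mask = 0u8;
    for (lane, &v) in values.iter().enumerate() {
        if v > threshold {
            mask |= 1u8 << lane;
        }
    }
    mask
}

/// Apply `op` only to the lanes whose predicate bit is 1.
/// Lanes with a 0 bit are skipped and keep their old value.
fn predicated_apply(values: &mut [i32; LANES], mask: u8, op: impl Fn(i32) -> i32) {
    for (lane, v) in values.iter_mut().enumerate() {
        if (mask & (1u8 << lane)) != 0 {
            *v = op(*v);
        }
    }
}

/// Predicated form of the scalar code
/// `if x > 10 { x = x - 10 } else { x = x * 2 }`, applied to every lane.
/// Returns the predicate mask so the caller can inspect it.
fn predicated_if_else(values: &mut [i32; LANES]) -> u8 {
    // The mask is computed once, before either side modifies the data.
    let mask = compare_gt(values, 10);
    // "THEN" part: runs where the predicate bit is 1.
    predicated_apply(values, mask, |x| x - 10);
    // "ELSE" part: runs where the predicate bit is 0 (mask inverted,
    // then trimmed to the lanes that actually exist).
    let all_lanes = (1u8 << LANES) - 1;
    predicated_apply(values, !mask & all_lanes, |x| x * 2);
    mask
}

fn main() {
    let mut data = [3, 15, 10, 42];
    let mask = predicated_if_else(&mut data);
    // Bit 0 (rightmost) corresponds to lane 0.
    println!("mask = {:04b}, data = {:?}", mask, data);
}
```

With the input `[3, 15, 10, 42]`, lanes 1 and 3 pass the compare, so the mask is `1010` and the data ends up as `[6, 5, 20, 32]`. Both the "THEN" and the "ELSE" closures get walked over all four lanes, which is exactly the "wasted" work that makes predication unpopular on conventional hardware, and what skipping elements with a zero predicate bit is meant to avoid.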