LLVMPIPE -- a fork of softpipe that employs LLVM for code generation.


Status
======

Done so far is:

 - the whole fragment pipeline is code generated in a single function

   - input interpolation

   - depth testing

   - texture sampling (not all state/formats are supported)

   - fragment shader TGSI translation
     - same level of support as the TGSI SSE2 exec machine, with the exception
       that we don't fall back to TGSI interpretation when an unsupported
       opcode is found, but just ignore it
     - done in SoA layout
     - input interpolation also code generated

   - alpha testing

   - blend (including logic ops)
     - both in SoA and AoS layouts, but only the former is used for now

 - code is generic
   - intermediates can be vectors of floats, ubytes, fixed point, etc, and of
     any width and length
   - not all operations are implemented for these types yet though

Most mesa/progs/demos/* work.

To do (probably in this order):

 - code generate stipple and stencil testing

 - translate the remaining bits of texture sampling state

 - translate TGSI control flow instructions, and all other remaining opcodes

 - integrate with the draw module for VS code generation

 - code generate the triangle setup and rasterization


Requirements
============

 - Linux

 - An x86 or amd64 processor. 64bit mode is preferred.

   Support for sse2 is strongly encouraged. Support for ssse3 and sse4.1 will
   yield the most efficient code. The fewer features the CPU has, the more
   likely it is that you will run into underperforming, buggy, or incomplete
   code.

   See /proc/cpuinfo to know what your CPU supports.

 - LLVM 2.6.

   For Linux, on a recent Debian based distribution do:

     aptitude install llvm-dev

   For Windows download pre-built MSVC 9.0 or MinGW binaries from
   http://people.freedesktop.org/~jrfonseca/llvm/ and set the LLVM environment
   variable to the extracted path.
 - scons (optional)

 - udis86, http://udis86.sourceforge.net/ (optional):

     git clone git://udis86.git.sourceforge.net/gitroot/udis86/udis86
     cd udis86
     ./autogen.sh
     ./configure --with-pic
     make
     sudo make install


Building
========

To build everything on Linux invoke scons as:

  scons debug=yes statetrackers=mesa drivers=trace,llvmpipe winsys=xlib dri=false

Alternatively, you can build it with GNU make, if you prefer, by invoking it as

  make linux-llvm

but the rest of these instructions assume that scons is used.

For Windows everything is the same except the winsys:

  scons debug=yes statetrackers=mesa drivers=trace,llvmpipe winsys=gdi dri=false


Using
=====

On Linux, building will create a drop-in alternative for libGL.so. To use it,
set the environment variable:

  export LD_LIBRARY_PATH=$PWD/build/linux-x86_64-debug/lib:$LD_LIBRARY_PATH

or

  export LD_LIBRARY_PATH=$PWD/build/linux-x86-debug/lib:$LD_LIBRARY_PATH

For performance evaluation pass debug=no to scons, and use the corresponding
lib directory without the "-debug" suffix.

On Windows, building will create a drop-in alternative for opengl32.dll. To use
it, put it in the same directory as the application. It can also be used by
replacing the native ICD driver, but that is quite an advanced usage, so if you
need to ask, don't even try it.


Unit testing
============

Building will also create several unit tests in
build/linux-???-debug/gallium/drivers/llvmpipe:

 - lp_test_blend: blending
 - lp_test_conv: SIMD vector conversion
 - lp_test_format: pixel unpacking/packing

Some of these tests can output results and benchmarks to a tab-separated file
for later analysis, e.g.:

  build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv


Development Notes
=================

- When looking at this code for the first time, start in lp_state_fs.c, and
  then skim through the lp_bld_* functions called in there, and the comments
  at the top of the lp_bld_*.c files.
- All lp_bld_*.[ch] files are isolated from the rest of the driver, and
  could/may be put in a stand-alone Gallium state -> LLVM IR translation
  module.

- We use LLVM-C bindings for now. They are not documented, but follow the C++
  interfaces very closely, and appear to be complete enough for code
  generation. See
  http://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html
  for a stand-alone example.
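
As a rough illustration of the SoA and AoS layouts mentioned under Status, the
sketch below blends the same pixels in both layouts using plain C. It is not
taken from the driver, which generates this kind of code through LLVM. All
names here (pixel_aos, pixels_soa, blend_aos, blend_soa, blend_selfcheck,
NUM_PIXELS) are hypothetical, and the blend equation (source alpha "over") is
just one assumed blend state chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vector width: pixels processed together in the SoA case. */
#define NUM_PIXELS 4

/* AoS (array of structures): one element holds all channels of one pixel. */
struct pixel_aos {
   float r, g, b, a;
};

/* SoA (structure of arrays): one array holds one channel of all pixels. */
struct pixels_soa {
   float r[NUM_PIXELS];
   float g[NUM_PIXELS];
   float b[NUM_PIXELS];
   float a[NUM_PIXELS];
};

/* dst = src * src.a + dst * (1 - src.a), walking pixel by pixel. */
void
blend_aos(struct pixel_aos *dst, const struct pixel_aos *src, size_t n)
{
   size_t i;
   for (i = 0; i < n; ++i) {
      float sa = src[i].a;
      float inv = 1.0f - sa;
      dst[i].r = src[i].r * sa + dst[i].r * inv;
      dst[i].g = src[i].g * sa + dst[i].g * inv;
      dst[i].b = src[i].b * sa + dst[i].b * inv;
      dst[i].a = sa * sa + dst[i].a * inv;
   }
}

/* Same equation, but each statement touches one channel of every pixel,
 * which is the shape that maps directly onto SIMD registers. */
void
blend_soa(struct pixels_soa *dst, const struct pixels_soa *src)
{
   size_t i;
   for (i = 0; i < NUM_PIXELS; ++i) {
      float sa = src->a[i];
      float inv = 1.0f - sa;
      dst->r[i] = src->r[i] * sa + dst->r[i] * inv;
      dst->g[i] = src->g[i] * sa + dst->g[i] * inv;
      dst->b[i] = src->b[i] * sa + dst->b[i] * inv;
      dst->a[i] = sa * sa + dst->a[i] * inv;
   }
}

/* Blend half-transparent red over opaque blue in both layouts and check
 * that the two layouts produce identical results. */
void
blend_selfcheck(void)
{
   struct pixel_aos src_aos[NUM_PIXELS], dst_aos[NUM_PIXELS];
   struct pixels_soa src_soa, dst_soa;
   size_t i;

   for (i = 0; i < NUM_PIXELS; ++i) {
      src_aos[i].r = src_soa.r[i] = 1.0f;
      src_aos[i].g = src_soa.g[i] = 0.0f;
      src_aos[i].b = src_soa.b[i] = 0.0f;
      src_aos[i].a = src_soa.a[i] = 0.5f;
      dst_aos[i].r = dst_soa.r[i] = 0.0f;
      dst_aos[i].g = dst_soa.g[i] = 0.0f;
      dst_aos[i].b = dst_soa.b[i] = 1.0f;
      dst_aos[i].a = dst_soa.a[i] = 1.0f;
   }

   blend_aos(dst_aos, src_aos, NUM_PIXELS);
   blend_soa(&dst_soa, &src_soa);

   for (i = 0; i < NUM_PIXELS; ++i) {
      assert(dst_aos[i].r == dst_soa.r[i]);
      assert(dst_aos[i].g == dst_soa.g[i]);
      assert(dst_aos[i].b == dst_soa.b[i]);
      assert(dst_aos[i].a == dst_soa.a[i]);
   }
}
```

Calling blend_selfcheck() from a small main() is enough to exercise both
paths. The arithmetic is identical in the two functions; only the memory
layout differs, which is why the same blend state can in principle be code
generated for either layout.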