Gallium LLVMpipe Driver
=======================

Introduction
------------

The Gallium llvmpipe driver is a software rasterizer that uses LLVM to
do runtime code generation. Shaders, point/line/triangle rasterization
and vertex processing are implemented with LLVM IR which is translated
to x86, x86-64, or ppc64le machine code. Also, the driver is
multithreaded to take advantage of multiple CPU cores (up to 8 at this
time). It's the fastest software rasterizer for Mesa.

Requirements
------------

-  For x86 or amd64 processors, 64-bit mode is recommended. Support for
   SSE2 is strongly encouraged. Support for SSE3 and SSE4.1 will yield
   the most efficient code. The fewer features the CPU has, the more
   likely it is that you will run into underperforming, buggy, or
   incomplete code.

   For ppc64le processors, use of the Altivec feature (the Vector
   Facility) is recommended if supported; use of the VSX feature (the
   Vector-Scalar Facility) is recommended if supported and Mesa is
   built with LLVM version 4.0 or later.

   Check ``/proc/cpuinfo`` to see which features your CPU supports.

-  Unless otherwise stated, LLVM version 3.4 is recommended; 3.3 or
   later is required.

   For Linux, on a recent Debian-based distribution do:

   .. code-block:: console

      aptitude install llvm-dev

   If you want development snapshot builds of LLVM for Debian and
   derived distributions like Ubuntu, you can use the APT repository at
   `apt.llvm.org <https://apt.llvm.org/>`__, which is maintained by
   Debian's LLVM maintainer.

   For an RPM-based distribution do:

   .. code-block:: console

      yum install llvm-devel

   For Windows you will need to build LLVM from source with MSVC or
   MinGW (either natively or through cross compilers) and CMake, and set
   the ``LLVM`` environment variable to the directory you installed it
   to. LLVM will be statically linked, so when building with MSVC it
   must be built with the same CRT as Mesa, and you'll need to pass
   ``-DLLVM_USE_CRT_xxx=yyy`` as described below.

   +-----------------+----------------------------------------------------------------+
   | LLVM build-type | Mesa build-type                                                |
   |                 +--------------------------------+-------------------------------+
   |                 | debug,checked                  | release,profile               |
   +=================+================================+===============================+
   | Debug           | ``-DLLVM_USE_CRT_DEBUG=MTd``   | ``-DLLVM_USE_CRT_DEBUG=MT``   |
   +-----------------+--------------------------------+-------------------------------+
   | Release         | ``-DLLVM_USE_CRT_RELEASE=MTd`` | ``-DLLVM_USE_CRT_RELEASE=MT`` |
   +-----------------+--------------------------------+-------------------------------+

   You can build only the x86 target by passing
   ``-DLLVM_TARGETS_TO_BUILD=X86`` to cmake.

-  scons (optional)
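
The CPU feature check from the first requirement can be scripted. The
following is a minimal sketch, assuming an x86 Linux system whose
``/proc/cpuinfo`` contains a ``flags`` line (Linux lists SSE3 there as
``pni``). The ``check_cpu_features`` helper is hypothetical and not part
of Mesa:

.. code-block:: sh

   # Hypothetical helper: report whether SSE2, SSE3 (listed as "pni")
   # and SSE4.1 appear in a cpuinfo-style file.
   check_cpu_features() {
       cpuinfo=${1:-/proc/cpuinfo}
       # Take the feature list of the first CPU only.
       flags=$(grep -m1 '^flags' "$cpuinfo" | cut -d: -f2)
       for feature in sse2 pni sse4_1; do
           # Pad with spaces so that e.g. "ssse3" cannot match "sse3".
           case " $flags " in
               *" $feature "*) echo "$feature: yes" ;;
               *)              echo "$feature: no" ;;
           esac
       done
   }

Calling ``check_cpu_features`` without an argument reads
``/proc/cpuinfo``; a different file can be passed for testing.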

Building
--------

To build everything on Linux invoke scons as:

.. code-block:: console

   scons build=debug libgl-xlib

Alternatively, you can build with Meson:

.. code-block:: console

   mkdir build
   cd build
   meson -D glx=gallium-xlib -D gallium-drivers=swrast
   ninja

The rest of these instructions, however, assume that scons is used. For
Windows the procedure is the same except for the target:

.. code-block:: console

   scons platform=windows build=debug libgl-gdi

Using
-----

Linux
~~~~~

On Linux, building creates a drop-in replacement for ``libGL.so`` in

::

   build/foo/gallium/targets/libgl-xlib/libGL.so

or

::

   lib/gallium/libGL.so

To use it, set the ``LD_LIBRARY_PATH`` environment variable to the
directory that contains it.
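
As a minimal sketch of that step, the hypothetical ``use_mesa_libgl``
helper below prepends a directory to ``LD_LIBRARY_PATH``. The directory
in the usage line is an assumption modelled on a scons debug build;
substitute the directory that actually holds your ``libGL.so``:

.. code-block:: sh

   # Hypothetical helper: put a directory first in the dynamic linker's
   # search path so that its libGL.so is found before the system one.
   use_mesa_libgl() {
       libdir=$1
       # Append the previous value only if it was set and non-empty.
       LD_LIBRARY_PATH="$libdir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
       export LD_LIBRARY_PATH
   }

   # Assumed location; adjust to your build.
   use_mesa_libgl "$PWD/build/linux-x86_64-debug/gallium/targets/libgl-xlib"

Programs started from the same shell afterwards should then load the
llvmpipe ``libGL.so`` instead of the system one.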

For performance evaluation pass ``build=release`` to scons, and use the
corresponding lib directory without the ``-debug`` suffix.

Windows
~~~~~~~

On Windows, building will create
``build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll``,
which is a drop-in replacement for the system's ``opengl32.dll``. To use
it, put it in the same directory as your application. It can also be
used by replacing the native ICD driver, but that is advanced usage, so
if you need to ask, don't even try it.

There is, however, an easy way to replace the OpenGL software renderer
that comes with Microsoft Windows 7 (or later) with llvmpipe (that is,
on systems without any OpenGL drivers):

-  Copy
   ``build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll`` to
   ``C:\Windows\SysWOW64\mesadrv.dll``

-  Load these registry settings:

   ::

      REGEDIT4

      ; https://technet.microsoft.com/en-us/library/cc749368.aspx
      ; https://www.msfn.org/board/topic/143241-portable-windows-7-build-from-winpe-30/page-5#entry942596
      [HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows NT\CurrentVersion\OpenGLDrivers\MSOGL]
      "DLL"="mesadrv.dll"
      "DriverVersion"=dword:00000001
      "Flags"=dword:00000001
      "Version"=dword:00000002

-  Do the same for the 64-bit drivers if you need them.

Profiling
---------

To profile llvmpipe you should build as

::

   scons build=profile <same-as-before>

This will ensure that frame pointers are used both in C and JIT
functions, and that no tail call optimizations are done by gcc.

Linux perf integration
~~~~~~~~~~~~~~~~~~~~~~

On Linux, it is possible to have symbol resolution of JIT code with
`Linux perf <https://perf.wiki.kernel.org/>`__:

::

   perf record -g /my/application
   perf report

When run inside Linux perf, llvmpipe will create a
``/tmp/perf-XXXXX.map`` file containing a symbol address table. It also
dumps assembly code to ``/tmp/perf-XXXXX.map.asm``, which can be used by
the ``bin/perf-annotate-jit.py`` script to produce a disassembly of the
generated code annotated with the samples.
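
The symbol table can also be inspected by hand. The sketch below assumes
that the map file follows the usual perf JIT map layout of one
``START SIZE NAME`` entry per line, with start and size in hexadecimal;
that layout is an assumption here, not something this document
specifies. The ``lookup_jit_symbol`` helper is hypothetical:

.. code-block:: sh

   # Hypothetical helper: print the JIT symbol whose address range
   # contains the given address.
   # Usage: lookup_jit_symbol MAPFILE ADDRESS
   lookup_jit_symbol() {
       map=$1
       addr=$(( $2 ))              # accepts decimal or 0x-prefixed hex
       while read -r start size name; do
           [ -n "$size" ] || continue      # skip blank or short lines
           begin=$(( 0x$start ))
           end=$(( begin + 0x$size ))
           if [ "$addr" -ge "$begin" ] && [ "$addr" -lt "$end" ]; then
               printf '%s\n' "$name"
               return 0
           fi
       done < "$map"
       return 1                    # no symbol covers this address
   }

Given an address taken from a ``perf report`` line, this prints the name
of the JIT function covering it, or returns non-zero if there is none.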

You can obtain a call graph via
`Gprof2Dot <https://github.com/jrfonseca/gprof2dot#linux-perf>`__.

Unit testing
------------

Building will also create several unit tests in
``build/linux-???-debug/gallium/drivers/llvmpipe``:

-  ``lp_test_blend``: blending
-  ``lp_test_conv``: SIMD vector conversion
-  ``lp_test_format``: pixel unpacking/packing

Some of these tests can output results and benchmarks to a tab-separated
file for later analysis, e.g.:

::

   build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv
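
To run every test in one go, a loop along these lines can help. It is a
sketch that assumes each test binary exits with a non-zero status on
failure, which this document does not state; ``run_lp_tests`` is a
hypothetical helper:

.. code-block:: sh

   # Hypothetical helper: run every executable lp_test_* in a directory
   # and return non-zero if any of them fails.
   run_lp_tests() {
       dir=$1
       failed=0
       for t in "$dir"/lp_test_*; do
           # Skip non-executables and the literal pattern left behind
           # when the glob matches nothing.
           [ -x "$t" ] || continue
           if "$t"; then
               echo "PASS: ${t##*/}"
           else
               echo "FAIL: ${t##*/}"
               failed=1
           fi
       done
       return "$failed"
   }

For example, ``run_lp_tests build/linux-x86_64-debug/gallium/drivers/llvmpipe``
would cover the binaries listed above.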

Development Notes
-----------------

-  When looking at this code for the first time, start in
   ``lp_state_fs.c``, then skim through the ``lp_bld_*`` functions
   called there and the comments at the top of the ``lp_bld_*.c``
   files.
-  The driver-independent parts of the LLVM / Gallium code are found in
   ``src/gallium/auxiliary/gallivm/``. The filenames and function
   prefixes need to be renamed from ``lp_bld_`` to something else
   though.
-  We use LLVM-C bindings for now. They are not documented, but follow
   the C++ interfaces very closely, and appear to be complete enough for
   code generation. See `this stand-alone
   example <https://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html>`__.
   See the ``llvm-c/Core.h`` file for reference.

.. _recommended_reading:

Recommended Reading
-------------------

-  Rasterization

   -  `Triangle Scan Conversion using 2D Homogeneous
      Coordinates <https://www.cs.unc.edu/~olano/papers/2dh-tri/>`__
   -  `Rasterization on
      Larrabee <http://www.drdobbs.com/parallel/rasterization-on-larrabee/217200602>`__
      (`DevMaster
      copy <http://devmaster.net/posts/2887/rasterization-on-larrabee>`__)
   -  `Rasterization using half-space
      functions <http://devmaster.net/posts/6133/rasterization-using-half-space-functions>`__
   -  `Advanced
      Rasterization <http://devmaster.net/posts/6145/advanced-rasterization>`__
   -  `Optimizing Software Occlusion
      Culling <https://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index/>`__

-  Texture sampling

   -  `Perspective Texture
      Mapping <http://chrishecker.com/Miscellaneous_Technical_Articles#Perspective_Texture_Mapping>`__
   -  `Texturing As In
      Unreal <https://www.flipcode.com/archives/Texturing_As_In_Unreal.shtml>`__
   -  `Run-Time MIP-Map
      Filtering <http://www.gamasutra.com/view/feature/3301/runtime_mipmap_filtering.php>`__
   -  `Will "brilinear" filtering
      persist? <http://alt.3dcenter.org/artikel/2003/10-26_a_english.php>`__
   -  `Trilinear
      filtering <http://ixbtlabs.com/articles2/gffx/nv40-rx800-3.html>`__
   -  `Texture
      Swizzling <http://devmaster.net/posts/12785/texture-swizzling>`__

-  SIMD

   -  `Whole-Function
      Vectorization <http://www.cdl.uni-saarland.de/projects/wfv/#header4>`__

-  Optimization

   -  `Optimizing Pixomatic For Modern x86
      Processors <http://www.drdobbs.com/optimizing-pixomatic-for-modern-x86-proc/184405807>`__
   -  `Intel 64 and IA-32 Architectures Optimization Reference
      Manual <http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html>`__
   -  `Software optimization
      resources <http://www.agner.org/optimize/>`__
   -  `Intel Intrinsics
      Guide <https://software.intel.com/en-us/articles/intel-intrinsics-guide>`__

-  LLVM

   -  `LLVM Language Reference
      Manual <http://llvm.org/docs/LangRef.html>`__
   -  `The secret of LLVM C
      bindings <https://npcontemplation.blogspot.co.uk/2008/06/secret-of-llvm-c-bindings.html>`__

-  General

   -  `A trip through the Graphics
      Pipeline <https://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/>`__
   -  `WARP Architecture and
      Performance <https://msdn.microsoft.com/en-us/library/gg615082.aspx#architecture>`__