1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
2 <html lang="en">
3 <head>
4 <meta http-equiv="content-type" content="text/html; charset=utf-8">
5 <title>llvmpipe</title>
6 <link rel="stylesheet" type="text/css" href="mesa.css">
7 </head>
8 <body>
9
10 <div class="header">
11 <h1>The Mesa 3D Graphics Library</h1>
12 </div>
13
14 <iframe src="contents.html"></iframe>
15 <div class="content">
16
17 <h1>Introduction</h1>
18
19 <p>
20 The Gallium llvmpipe driver is a software rasterizer that uses LLVM to
21 do runtime code generation.
22 Shaders, point/line/triangle rasterization and vertex processing are
23 implemented with LLVM IR which is translated to x86 or x86-64 machine
24 code.
25 Also, the driver is multithreaded to take advantage of multiple CPU cores
26 (up to 8 at this time).
27 It's the fastest software rasterizer for Mesa.
28 </p>
29
30
31 <h1>Requirements</h1>
32
33 <ul>
34 <li>
35 <p>An x86 or amd64 processor; 64-bit mode recommended.</p>
36 <p>
37 Support for SSE2 is strongly encouraged. Support for SSSE3 and SSE4.1 will
38 yield the most efficient code. The fewer features the CPU has, the more
39 likely it is that you will run into underperforming, buggy, or incomplete code.
40 </p>
41 <p>
42 Check /proc/cpuinfo to see which features your CPU supports.
43 </p>
44 </li>
45 <li>
46 <p>LLVM: version 2.9 recommended; 2.6 or later required.</p>
47 <p><b>NOTE</b>: LLVM 2.8 and earlier will not work on systems that support the
48 Intel AVX extensions (e.g. Sandybridge). LLVM's code generator will
49 fail when trying to emit AVX instructions. This was fixed in LLVM 2.9.
50 </p>
51 <p>
52 For Linux, on a recent Debian-based distribution, do:
53 </p>
54 <pre>
55 aptitude install llvm-dev
56 </pre>
57 <p>
58 For an RPM-based distribution, do:
59 </p>
60 <pre>
61 yum install llvm-devel
62 </pre>
63
64 <p>
65 For Windows you will need to build LLVM from source with MSVC or MinGW
66 (either natively or through cross compilers) and CMake, and set the LLVM
67 environment variable to the directory you installed it to.
68
69 LLVM will be statically linked, so when building with MSVC it must be
70 built with the same CRT as Mesa: pass
71 -DLLVM_USE_CRT_DEBUG=MTd for debug and checked builds, or
72 -DLLVM_USE_CRT_RELEASE=MT for profile and release builds.
73
74 You can build only the x86 target by passing -DLLVM_TARGETS_TO_BUILD=X86
75 to cmake.
76 </p>
77 </li>
78
79 <li>
80 <p>scons (optional)</p>
81 </li>
82 </ul>
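<p>
As a minimal sketch of the CPU check mentioned above, the following script
reports whether the SIMD extensions named in the requirements are advertised.
It assumes the x86 Linux layout of /proc/cpuinfo (a "flags" line using the
names sse2, ssse3, sse4_1 and avx); <code>cpu_has_flag</code> is a
hypothetical helper, not part of Mesa.
</p>

```shell
#!/bin/sh
# Hypothetical helper: succeed if the CPU flag $1 is listed in the
# cpuinfo file $2 (defaults to /proc/cpuinfo).
# Assumes the x86 Linux format with a "flags : ..." line.
cpu_has_flag() {
    flag=$1
    file=${2:-/proc/cpuinfo}
    [ -r "$file" ] || return 2
    # -w matches whole words only, so "sse3" does not match "ssse3".
    grep -m1 '^flags' "$file" | grep -qw -- "$flag"
}

# Report the extensions named in the requirements list.
for f in sse2 ssse3 sse4_1 avx; do
    if cpu_has_flag "$f"; then
        echo "$f: yes"
    else
        echo "$f: no"
    fi
done
```

<p>
An "avx: yes" line on a system with LLVM 2.8 or earlier indicates the
configuration that the note above warns about.
</p>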
83
84
85 <h1>Building</h1>
86
87 To build everything on Linux invoke scons as:
88
89 <pre>
90 scons build=debug libgl-xlib
91 </pre>
92
93 Alternatively, you can build it with GNU make, if you prefer, by invoking it as
94
95 <pre>
96 make linux-llvm
97 </pre>
98
99 but the rest of these instructions assume that scons is used.
100
101 For Windows the procedure is similar except the target:
102
103 <pre>
104 scons build=debug libgl-gdi
105 </pre>
106
107
108 <h1>Using</h1>
109
110 On Linux, building will create a drop-in alternative for libGL.so in
111
112 <pre>
113 build/foo/gallium/targets/libgl-xlib/libGL.so
114 </pre>
115 or
116 <pre>
117 lib/gallium/libGL.so
118 </pre>
119
120 To use it, set the LD_LIBRARY_PATH environment variable to the directory that contains it.
121
122 For performance evaluation pass build=release to scons, and use the corresponding
123 lib directory without the "-debug" suffix.
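<p>
Setting LD_LIBRARY_PATH as described above might look like the following
sketch. The helper name <code>prepend_lib_dir</code> is hypothetical, and the
directory is the second of the two locations listed above; adjust it to
wherever your build placed libGL.so.
</p>

```shell
#!/bin/sh
# Hypothetical helper: prepend directory $1 to LD_LIBRARY_PATH.
# The ${VAR:+...} form avoids a trailing colon when the variable is empty;
# an empty entry would make the loader search the current directory.
prepend_lib_dir() {
    LD_LIBRARY_PATH=$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
    export LD_LIBRARY_PATH
}

# Directory taken from the text above, relative to the Mesa source tree.
prepend_lib_dir "$PWD/lib/gallium"

# To check which renderer an application will get (requires glxinfo):
#   glxinfo | grep "OpenGL renderer"
```

<p>
Because the variable is only exported in the current shell, other sessions
keep using the system libGL.so.
</p>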
124
125 On Windows, building will create a drop-in alternative for opengl32.dll. To use
126 it put it in the same directory as the application. It can also be used by
127 replacing the native ICD driver, but it's quite an advanced usage, so if you
128 need to ask, don't even try it.
129
130
131 <h1>Profiling</h1>
132
133 <p>
134 To profile llvmpipe you should build as
135 </p>
136 <pre>
137 scons build=profile &lt;same-as-before&gt;
138 </pre>
139
140 <p>
141 This will ensure that frame pointers are used both in C and JIT functions, and
142 that no tail call optimizations are done by gcc.
143 </p>
144
145 <h2>Linux perf integration</h2>
146
147 <p>
148 On Linux, it is possible to have symbol resolution of JIT code with <a href="http://perf.wiki.kernel.org/">Linux perf</a>:
149 </p>
150
151 <pre>
152 perf record -g /my/application
153 perf report
154 </pre>
155
156 <p>
157 When run inside Linux perf, llvmpipe will create a /tmp/perf-XXXXX.map file with
158 a symbol address table. It also dumps assembly code to /tmp/perf-XXXXX.map.asm,
159 which can be used by the bin/perf-annotate-jit script to produce disassembly of
160 the generated code annotated with the samples.
161 </p>
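<p>
To illustrate what the symbol address table contains, here is a small sketch
that resolves an address to a JIT symbol by hand. It assumes the perf map
format used by Linux perf (one "START SIZE NAME" entry per line, with START
and SIZE in hexadecimal and no 0x prefix); <code>perf_map_lookup</code> is a
hypothetical helper, not part of Mesa or perf.
</p>

```shell
#!/bin/sh
# Hypothetical helper: print "NAME+0xOFFSET" for the map entry containing
# address $2 in perf map file $1; fail if no entry contains it.
# Assumes "START SIZE NAME" lines with hexadecimal START and SIZE.
perf_map_lookup() {
    map=$1
    addr=$(( $2 ))                # accepts decimal or 0x-prefixed hex
    while read -r start size name; do
        [ -n "$name" ] || continue    # skip blank or malformed lines
        s=$(( 0x$start ))
        e=$(( s + 0x$size ))
        if [ "$addr" -ge "$s" ] && [ "$addr" -lt "$e" ]; then
            printf '%s+0x%x\n' "$name" "$(( addr - s ))"
            return 0
        fi
    done < "$map"
    return 1
}
```

<p>
For example, <code>perf_map_lookup /tmp/perf-XXXXX.map 0x7f0000001010</code>
would print the name of the JIT function covering that address together with
the offset into it (the address here is made up for illustration).
</p>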
162
163 <p>You can obtain a call graph via
164 <a href="http://code.google.com/p/jrfonseca/wiki/Gprof2Dot#linux_perf">Gprof2Dot</a>.</p>
165
166
167 <h1>Unit testing</h1>
168
169 <p>
170 Building will also create several unit tests in
171 build/linux-???-debug/gallium/drivers/llvmpipe:
172 </p>
173
174 <ul>
175 <li> lp_test_blend: blending
176 <li> lp_test_conv: SIMD vector conversion
177 <li> lp_test_format: pixel unpacking/packing
178 </ul>
179
180 <p>
181 Some of these tests can output results and benchmarks to a tab-separated file
182 for later analysis, e.g.:
183 </p>
184 <pre>
185 build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv
186 </pre>
187
188
189 <h1>Development Notes</h1>
190
191 <ul>
192 <li>
193 When looking at this code for the first time, start in lp_state_fs.c, and
194 then skim through the lp_bld_* functions called in there, and the comments
195 at the top of the lp_bld_*.c files.
196 </li>
197 <li>
198 The driver-independent parts of the LLVM / Gallium code are found in
199 src/gallium/auxiliary/gallivm/. The filenames and function prefixes
200 need to be renamed from "lp_bld_" to something else though.
201 </li>
202 <li>
203 We use LLVM-C bindings for now. They are not documented, but follow the C++
204 interfaces very closely, and appear to be complete enough for code
205 generation. See
206 <a href="http://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html">
207 this stand-alone example</a>. See the llvm-c/Core.h file for reference.
208 </li>
209 </ul>
210
211 <h1 id="recommended_reading">Recommended Reading</h1>
212
213 <ul>
214 <li>
215 <p>Rasterization</p>
216 <ul>
217 <li><a href="http://www.cs.unc.edu/~olano/papers/2dh-tri/">Triangle Scan Conversion using 2D Homogeneous Coordinates</a></li>
218 <li><a href="http://www.drdobbs.com/parallel/rasterization-on-larrabee/217200602">Rasterization on Larrabee</a> (<a href="http://devmaster.net/posts/2887/rasterization-on-larrabee">DevMaster copy</a>)</li>
219 <li><a href="http://devmaster.net/posts/6133/rasterization-using-half-space-functions">Rasterization using half-space functions</a></li>
220 <li><a href="http://devmaster.net/posts/6145/advanced-rasterization">Advanced Rasterization</a></li>
221 <li><a href="http://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index/">Optimizing Software Occlusion Culling</a></li>
222 </ul>
223 </li>
224 <li>
225 <p>Texture sampling</p>
226 <ul>
227 <li><a href="http://chrishecker.com/Miscellaneous_Technical_Articles#Perspective_Texture_Mapping">Perspective Texture Mapping</a></li>
228 <li><a href="http://www.flipcode.com/archives/Texturing_As_In_Unreal.shtml">Texturing As In Unreal</a></li>
229 <li><a href="http://www.gamasutra.com/view/feature/3301/runtime_mipmap_filtering.php">Run-Time MIP-Map Filtering</a></li>
230 <li><a href="http://alt.3dcenter.org/artikel/2003/10-26_a_english.php">Will "brilinear" filtering persist?</a></li>
231 <li><a href="http://ixbtlabs.com/articles2/gffx/nv40-rx800-3.html">Trilinear filtering</a></li>
232 <li><a href="http://devmaster.net/posts/12785/texture-swizzling">Texture Swizzling</a></li>
233 </ul>
234 </li>
235 <li>
236 <p>SIMD</p>
237 <ul>
238 <li><a href="http://www.cdl.uni-saarland.de/projects/wfv/#header4">Whole-Function Vectorization</a></li>
239 </ul>
240 </li>
241 <li>
242 <p>Optimization</p>
243 <ul>
244 <li><a href="http://www.drdobbs.com/optimizing-pixomatic-for-modern-x86-proc/184405807">Optimizing Pixomatic For Modern x86 Processors</a></li>
245 <li><a href="http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html">Intel 64 and IA-32 Architectures Optimization Reference Manual</a></li>
246 <li><a href="http://www.agner.org/optimize/">Software optimization resources</a></li>
247 <li><a href="http://software.intel.com/en-us/articles/intel-intrinsics-guide">Intel Intrinsics Guide</a></li>
248 </ul>
249 </li>
250 <li>
251 <p>LLVM</p>
252 <ul>
253 <li><a href="http://llvm.org/docs/LangRef.html">LLVM Language Reference Manual</a></li>
254 <li><a href="http://npcontemplation.blogspot.co.uk/2008/06/secret-of-llvm-c-bindings.html">The secret of LLVM C bindings</a></li>
255 </ul>
256 </li>
257 <li>
258 <p>General</p>
259 <ul>
260 <li><a href="http://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/">A trip through the Graphics Pipeline</a></li>
261 <li><a href="http://msdn.microsoft.com/en-us/library/gg615082.aspx#architecture">WARP Architecture and Performance</a></li>
262 </ul>
263 </li>
264 </ul>
265
266 </div>
267 </body>
268 </html>