scons: Generate SSE2 floating-point arithmetic.
author José Fonseca <jfonseca@vmware.com>
Tue, 25 Nov 2014 22:15:40 +0000 (22:15 +0000)
committer José Fonseca <jfonseca@vmware.com>
Wed, 26 Nov 2014 20:25:12 +0000 (20:25 +0000)
- SSE2 is available on all x86 processors we care about.

- It's recommended by Intel:

  https://software.intel.com/en-us/blogs/2012/09/26/gcc-x86-performance-hints

- It has been the default since MSVC 2012:

  http://msdn.microsoft.com/en-us/library/7t5yh4fd(v=vs.110).aspx

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
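
For illustration only, the flag selection this change enables can be sketched as a standalone function. The helper name x86_fp_ccflags and its toolchain argument are hypothetical and are not part of scons/gallium.py; the sketch also assumes the GCC flags are only added for 32-bit x86 builds, which the hunk context below does not show.

```python
def x86_fp_ccflags(toolchain, machine):
    """Return the SSE2-related compiler flags for a toolchain/machine pair.

    Hypothetical helper: the real script appends to ccflags inline
    inside generate(env) instead of calling a function like this.
    """
    if machine != 'x86':
        # Assumption: only 32-bit x86 needs these flags; x86-64
        # compilers already use SSE2 for scalar floating point.
        return []
    if toolchain == 'gcc':
        return [
            '-msse', '-msse2',  # enable SIMD intrinsics
            '-mfpmath=sse',     # generate SSE floating-point arithmetic
        ]
    if toolchain == 'msvc':
        return [
            '/arch:SSE2',       # use the SSE2 instructions
        ]
    # Unknown toolchain: add nothing rather than guess at flags.
    return []


if __name__ == '__main__':
    for toolchain in ('gcc', 'msvc'):
        print(toolchain, x86_fp_ccflags(toolchain, 'x86'))
```

With GCC, -mfpmath=sse makes the compiler emit scalar SSE instructions for floating-point arithmetic instead of x87 code; with MSVC, /arch:SSE2 has the same intent and merely makes explicit what MSVC 2012 and later already do by default.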
scons/gallium.py

index fe800fa0f71bcc622bbd0d9d4bbe125ae004b704..8e2090bc6786842b746e0dae9d8656a597a84cba 100755 (executable)
@@ -390,7 +390,7 @@ def generate(env):
                 ccflags += [
                     '-mstackrealign', # ensure stack is aligned
                     '-msse', '-msse2', # enable SIMD intrinsics
-                    #'-mfpmath=sse',
+                    '-mfpmath=sse', # generate SSE floating-point arithmetic
                 ]
             if platform in ['windows', 'darwin']:
                 # Workaround http://gcc.gnu.org/bugzilla/show_bug.cgi?id=37216
@@ -469,7 +469,7 @@ def generate(env):
         ]
         if env['machine'] == 'x86':
             ccflags += [
-                #'/arch:SSE2', # use the SSE2 instructions
+                '/arch:SSE2', # use the SSE2 instructions (default since MSVC 2012)
             ]
         if platform == 'windows':
             ccflags += [