+2008-05-23 Uros Bizjak <ubizjak@gmail.com>
+ Jakub Jelinek <jakub@redhat.com>
+
+ PR target/36079
+ * configure.ac: Handle --enable-cld.
+ * configure: Regenerated.
+ * config.gcc: Add USE_IX86_CLD to tm_defines for x86 targets.
+ * config/i386/i386.h (struct machine_function): Add needs_cld field.
+ (ix86_current_function_needs_cld): New define.
+ * config/i386/i386.md (UNSPECV_CLD): New unspec volatile constant.
+ (cld): New insn pattern.
+ (strmov_singleop, rep_mov, strset_singleop, rep_stos, cmpstrnqi_nz_1,
+ cmpstrnqi_1, strlenqi_1): Set ix86_current_function_needs_cld flag.
+ * config/i386/i386.opt (mcld): New option.
+ * config/i386/i386.c (ix86_expand_prologue): Emit cld insn if
+ TARGET_CLD and ix86_current_function_needs_cld.
+ (override_options): Use -mcld by default for 32-bit code if
+ USE_IX86_CLD.
+
+ * doc/install.texi (Options specification): Document --enable-cld.
+ * doc/invoke.texi (Machine Dependent Options)
+ [i386 and x86-64 Options]: Add -mcld option.
+ (Intel 386 and AMD x86-64 Options): Document -mcld option.
+
2008-05-23 Kai Tietz <kai.tietz@onevision.com>
* config/i386/i386.c (return_in_memory_32): Add ATTRIBUTE_UNUSED.
(return_in_memory_64): Likewise.
(vector_alignment_reachable_p): Likewise.
* tree-vect-transform.c (vectorizable_load): Likewise.
* tree-vectorizer.c (vect_supportable_dr_alignment): Likewise.
-
- * tree-vectorizer.c (get_vectype_for_scalar_type): Pass mode of
- scalar_type to UNITS_PER_SIMD_WORD.
+ (get_vectype_for_scalar_type): Pass mode of scalar_type
+ to UNITS_PER_SIMD_WORD.
* config/arm/arm.h (UNITS_PER_SIMD_WORD): Updated.
* config/i386/i386.h (UNITS_PER_SIMD_WORD): Likewise.
2008-05-20 David Daney <ddaney@avtrex.com>
* config/mips/mips.md (UNSPEC_SYNC_NEW_OP_12,
- UNSPEC_SYNC_OLD_OP_12,
- UNSPEC_SYNC_EXCHANGE_12): New define_constants.
- (UNSPEC_SYNC_EXCHANGE, UNSPEC_MEMORY_BARRIER,
- UNSPEC_SET_GOT_VERSION,
+ UNSPEC_SYNC_OLD_OP_12, UNSPEC_SYNC_EXCHANGE_12): New define_constants.
+ (UNSPEC_SYNC_EXCHANGE, UNSPEC_MEMORY_BARRIER, UNSPEC_SET_GOT_VERSION,
UNSPEC_UPDATE_GOT_VERSION): Renumber.
(optab, insn): Add 'plus' and 'minus' to define_code_attr.
(atomic_hiqi_op): New define_code_iterator.
- (sync_compare_and_swap<mode>): Call
- mips_expand_atomic_qihi instead of
+ (sync_compare_and_swap<mode>): Call mips_expand_atomic_qihi instead of
mips_expand_compare_and_swap_12.
(compare_and_swap_12): Use MIPS_COMPARE_AND_SWAP_12 instead of
- MIPS_COMPARE_AND_SWAP_12_0. Pass argument to
- MIPS_COMPARE_AND_SWAP_12.
+ MIPS_COMPARE_AND_SWAP_12_0. Pass argument to MIPS_COMPARE_AND_SWAP_12.
(sync_<optab><mode>, sync_old_<optab><mode>,
sync_new_<optab><mode>, sync_nand<mode>, sync_old_nand<mode>,
- sync_new_nand<mode>): New define_expands for HI and QI mode
- operands.
+ sync_new_nand<mode>): New define_expands for HI and QI mode operands.
(sync_<optab>_12, sync_old_<optab>_12, sync_new_<optab>_12,
sync_nand_12, sync_old_nand_12, sync_new_nand_12): New insns.
- (sync_lock_test_and_set<mode>): New define_expand for HI and QI
- modes.
+ (sync_lock_test_and_set<mode>): New define_expand for HI and QI modes.
(test_and_set_12): New insn.
(sync_old_add<mode>, sync_new_add<mode>, sync_old_<optab><mode>,
sync_new_<optab><mode>, sync_old_nand<mode>,
2008-05-20 Jan Sjodin <jan.sjodin@amd.com>
Sebastian Pop <sebastian.pop@amd.com>
- * tree-loop-linear.c (gather_interchange_stats): Look in the access matrix,
- and never look at the tree representation of the memory accesses.
+ * tree-loop-linear.c (gather_interchange_stats): Look in the access
+ matrix, and never look at the tree representation of the memory
+ accesses.
(linear_transform_loops): Computes parameters and access matrices.
- * tree-data-ref.c (compute_data_dependences_for_loop): Returns false when fails.
+ * tree-data-ref.c (compute_data_dependences_for_loop): Returns false
+ when fails.
(access_matrix_get_index_for_parameter): New.
* tree-data-ref.h (struct access_matrix): New.
(AM_LOOP_NEST_NUM, AM_NB_INDUCTION_VARS, AM_PARAMETERS, AM_MATRIX,
PR tree-optimization/36206
* tree-chrec.h (chrec_fold_op): New.
- * tree-data-ref.c (initialize_matrix_A): Traverse NOP_EXPR, PLUS_EXPR, and
- other trees.
+ * tree-data-ref.c (initialize_matrix_A): Traverse NOP_EXPR, PLUS_EXPR,
+ and other trees.
2008-05-20 Nathan Sidwell <nathan@codesourcery.com>
* c-incpath.c (INO_T_EQ): Do not define on non-inode systems.
(DIRS_EQ): New.
- (remove_duplicates): Do not set inode on non-inode systems. Use
- DIRS_EQ.
+ (remove_duplicates): Do not set inode on non-inode systems.
+ Use DIRS_EQ.
2008-05-20 Sandra Loosemore <sandra@codesourcery.com>
2008-05-20 Richard Guenther <rguenther@suse.de>
- * tree-ssa-reassoc.c (fini_reassoc): Use the statistics
- infrastructure.
+ * tree-ssa-reassoc.c (fini_reassoc): Use the statistics infrastructure.
* tree-ssa-sccvn.c (process_scc): Likewise.
* tree-ssa-sink.c (execute_sink_code): Likewise.
* tree-ssa-threadupdate.c (thread_through_all_blocks): Likewise.
fi
case ${target} in
+i[34567]86-*-*)
+ if test $enable_cld = yes; then
+ tm_defines="${tm_defines} USE_IX86_CLD=1"
+ fi
+ ;;
x86_64-*-*)
tm_file="i386/biarch64.h ${tm_file}"
+ if test $enable_cld = yes; then
+ tm_defines="${tm_defines} USE_IX86_CLD=1"
+ fi
;;
esac
can be optimized to ap = __builtin_next_arg (0). */
if (!TARGET_64BIT || TARGET_64BIT_MS_ABI)
targetm.expand_builtin_va_start = NULL;
+
+#ifdef USE_IX86_CLD
+ /* Use -mcld by default for 32-bit code if configured with --enable-cld. */
+ if (!TARGET_64BIT)
+ target_flags |= MASK_CLD & ~target_flags_explicit;
+#endif
}
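
The default-setting statement in the hunk above relies on a small bitmask idiom: the flag is turned on only when the user has not mentioned it explicitly. The following is a minimal standalone sketch of that idiom, not code from the patch. `MASK_CLD_DEMO` and `apply_cld_default` are hypothetical names, and the bit value is an arbitrary assumption.

```c
/* Hypothetical stand-in for MASK_CLD; the real bit value is assigned by
   the option machinery and is not shown in this patch.  */
#define MASK_CLD_DEMO 0x1u

/* Return FLAGS with the CLD bit defaulted on, unless EXPLICIT_FLAGS
   records that the user already chose -mcld or -mno-cld.  In that case
   FLAGS is assumed to hold the user's choice and is left alone.  */
static unsigned
apply_cld_default (unsigned flags, unsigned explicit_flags)
{
  return flags | (MASK_CLD_DEMO & ~explicit_flags);
}
```

With no explicit option the bit is set; with an explicit `-mno-cld` (bit clear in `flags`, set in `explicit_flags`) it stays clear.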
\f
/* Return true if this goes in large data/bss. */
emit_insn (gen_prologue_use (pic_offset_table_rtx));
emit_insn (gen_blockage ());
}
+
+ /* Emit cld instruction if stringops are used in the function. */
+ if (TARGET_CLD && ix86_current_function_needs_cld)
+ emit_insn (gen_cld ());
}
/* Emit code to restore saved registers using MOV insns. First register
int save_varrargs_registers;
int accesses_prev_frame;
int optimize_mode_switching[MAX_386_ENTITIES];
- /* Set by ix86_compute_frame_layout and used by prologue/epilogue expander to
- determine the style used. */
+ int needs_cld;
+ /* Set by ix86_compute_frame_layout and used by prologue/epilogue
+ expander to determine the style used. */
int use_fast_prologue_epilogue;
/* Number of saved registers USE_FAST_PROLOGUE_EPILOGUE has been computed
for. */
#define ix86_stack_locals (cfun->machine->stack_locals)
#define ix86_save_varrargs_registers (cfun->machine->save_varrargs_registers)
#define ix86_optimize_mode_switching (cfun->machine->optimize_mode_switching)
+#define ix86_current_function_needs_cld (cfun->machine->needs_cld)
#define ix86_tls_descriptor_calls_expanded_in_cfun \
(cfun->machine->tls_descriptor_call_expanded_p)
/* Since tls_descriptor_call_expanded is not cleared, even if all TLS
(UNSPECV_XCHG 12)
(UNSPECV_LOCK 13)
(UNSPECV_PROLOGUE_USE 14)
+ (UNSPECV_CLD 15)
])
;; Constants to represent pcomtrue/pcomfalse variants
\f
;; Block operation instructions
+(define_insn "cld"
+ [(unspec_volatile [(const_int 0)] UNSPECV_CLD)]
+ ""
+ "cld"
+ [(set_attr "length" "1")
+ (set_attr "length_immediate" "0")
+ (set_attr "modrm" "0")])
+
(define_expand "movmemsi"
[(use (match_operand:BLK 0 "memory_operand" ""))
(use (match_operand:BLK 1 "memory_operand" ""))
(set (match_operand 2 "register_operand" "")
(match_operand 5 "" ""))])]
"TARGET_SINGLE_STRINGOP || optimize_size"
- "")
+ "ix86_current_function_needs_cld = 1;")
(define_insn "*strmovdi_rex_1"
[(set (mem:DI (match_operand:DI 2 "register_operand" "0"))
(match_operand 3 "memory_operand" ""))
(use (match_dup 4))])]
""
- "")
+ "ix86_current_function_needs_cld = 1;")
(define_insn "*rep_movdi_rex64"
[(set (match_operand:DI 2 "register_operand" "=c") (const_int 0))
(set (match_operand 0 "register_operand" "")
(match_operand 3 "" ""))])]
"TARGET_SINGLE_STRINGOP || optimize_size"
- "")
+ "ix86_current_function_needs_cld = 1;")
(define_insn "*strsetdi_rex_1"
[(set (mem:DI (match_operand:DI 1 "register_operand" "0"))
(use (match_operand 3 "register_operand" ""))
(use (match_dup 1))])]
""
- "")
+ "ix86_current_function_needs_cld = 1;")
(define_insn "*rep_stosdi_rex64"
[(set (match_operand:DI 1 "register_operand" "=c") (const_int 0))
(clobber (match_operand 1 "register_operand" ""))
(clobber (match_dup 2))])]
""
- "")
+ "ix86_current_function_needs_cld = 1;")
(define_insn "*cmpstrnqi_nz_1"
[(set (reg:CC FLAGS_REG)
(clobber (match_operand 1 "register_operand" ""))
(clobber (match_dup 2))])]
""
- "")
+ "ix86_current_function_needs_cld = 1;")
(define_insn "*cmpstrnqi_1"
[(set (reg:CC FLAGS_REG)
(clobber (match_operand 1 "register_operand" ""))
(clobber (reg:CC FLAGS_REG))])]
""
- "")
+ "ix86_current_function_needs_cld = 1;")
(define_insn "*strlenqi_1"
[(set (match_operand:SI 0 "register_operand" "=&c")
;; Instruction support
+mcld
+Target Report Mask(CLD)
+Generate cld instruction in the function prologue.
+
mabm
Target Report RejectNegative Var(x86_abm)
Support code generation of Advanced Bit Manipulation (ABM) instructions.
--enable-sjlj-exceptions
arrange to use setjmp/longjmp exception handling
--enable-secureplt enable -msecure-plt by default for PowerPC
+ --enable-cld enable -mcld by default for 32bit x86
--disable-win32-registry
disable lookup of installation paths in the
Registry on Windows hosts
fi;
+# Check whether --enable-cld or --disable-cld was given.
+if test "${enable_cld+set}" = set; then
+ enableval="$enable_cld"
+
+else
+ enable_cld=no
+fi;
+
# Windows32 Registry support for specifying GCC installation paths.
# Check whether --enable-win32-registry or --disable-win32-registry was given.
if test "${enable_win32_registry+set}" = set; then
[ --enable-secureplt enable -msecure-plt by default for PowerPC],
[], [])
+AC_ARG_ENABLE(cld,
+[ --enable-cld enable -mcld by default for 32bit x86], [],
+[enable_cld=no])
+
# Windows32 Registry support for specifying GCC installation paths.
AC_ARG_ENABLE(win32-registry,
[ --disable-win32-registry
See ``RS/6000 and PowerPC Options'' in the main manual
@end ifhtml
+@item --enable-cld
+This option enables @option{-mcld} by default for 32-bit x86 targets.
+@ifnothtml
+@xref{i386 and x86-64 Options,, i386 and x86-64 Options, gcc,
+Using the GNU Compiler Collection (GCC)},
+@end ifnothtml
+@ifhtml
+See ``i386 and x86-64 Options'' in the main manual
+@end ifhtml
+
@item --enable-win32-registry
@itemx --enable-win32-registry=@var{key}
@itemx --disable-win32-registry
-masm=@var{dialect} -mno-fancy-math-387 @gol
-mno-fp-ret-in-387 -msoft-float @gol
-mno-wide-multiply -mrtd -malign-double @gol
--mpreferred-stack-boundary=@var{num} -mcx16 -msahf -mrecip @gol
+-mpreferred-stack-boundary=@var{num} -mcld -mcx16 -msahf -mrecip @gol
-mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 @gol
-maes -mpclmul @gol
-msse4a -m3dnow -mpopcnt -mabm -msse5 @gol
the file containing the CPU detection code should be compiled without
these options.
+@item -mcld
+@opindex mcld
+This option instructs GCC to emit a @code{cld} instruction in the prologue
+of functions that use string instructions. String instructions depend on
+the DF flag to select between autoincrement and autodecrement mode. While the
+ABI specifies the DF flag to be cleared on function entry, some operating
+systems violate this specification by not clearing the DF flag in their
+exception dispatchers. The exception handler can then be invoked with the DF
+flag set, which selects the wrong direction mode when string instructions
+are used.
+This option can be enabled by default on 32-bit x86 targets by configuring
+GCC with the @option{--enable-cld} configure option. Generation of @code{cld}
+instructions can be suppressed with the @option{-mno-cld} compiler option
+in this case.
+
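
As a rough illustration of the DF behavior described above (not part of the patch), the C sketch below models a byte-wise `rep movsb` under each DF setting. The name `sim_rep_movsb` and its parameters are hypothetical. Real hardware updates the index and count registers rather than C pointers, so this only approximates the addressing pattern.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of `rep movsb`: copy COUNT bytes starting at the
   given addresses, walking upward when DF is 0 and downward when DF
   is 1.  The caller must ensure the touched range is valid.  */
static void
sim_rep_movsb (unsigned char *dst, const unsigned char *src,
               size_t count, int df)
{
  size_t i;

  assert (dst != NULL && src != NULL);
  for (i = 0; i < count; i++)
    {
      /* Offset from the start address: +i with DF clear, -i with DF set.  */
      ptrdiff_t off = df ? -(ptrdiff_t) i : (ptrdiff_t) i;
      dst[off] = src[off];
    }
}
```

Called with the same start pointers and count, the DF-set case reads and writes the bytes below the start address instead of above it. That is the failure mode the `cld` in the prologue guards against.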
@item -mcx16
@opindex mcx16
This option will enable GCC to use CMPXCHG16B instruction in generated code.