From: Uros Bizjak
Date: Fri, 23 May 2008 07:53:16 +0000 (+0200)
Subject: re PR target/36079 (cld instruction is not emitted anymore.)
X-Git-Url: https://git.libre-soc.org/?a=commitdiff_plain;h=922e3e33b211d1f01056457b093c8196aff40333;p=gcc.git

re PR target/36079 (cld instruction is not emitted anymore.)

	PR target/36079
	* configure.ac: Handle --enable-cld.
	* configure: Regenerated.
	* config.gcc: Add USE_IX86_CLD to tm_defines for x86 targets.
	* config/i386/i386.h (struct machine_function): Add needs_cld field.
	(ix86_current_function_needs_cld): New define.
	* config/i386/i386.md (UNSPEC_CLD): New unspec volatile constant.
	(cld): New isns pattern.
	(strmov_singleop, rep_mov, strset_singleop, rep_stos, cmpstrnqi_nz_1,
	cmpstrnqi_1, strlenqi_1): Set ix86_current_function_needs_cld flag.
	* config/i386/i386.opt (mcld): New option.
	* config/i386/i386.c (ix86_expand_prologue): Emit cld insn if
	TARGET_CLD and ix86_current_function_needs_cld.
	(override_options): Use -mcld by default for 32-bit code if
	USE_IX86_CLD.

	* doc/install.texi (Options specification): Document --enable-cld.
	* doc/invoke.texi (Machine Dependent Options)
	[i386 and x86-64 Options]: Add -mcld option.
	(Intel 386 and AMD x86-64 Options): Document -mcld option.

From-SVN: r135792
---

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index a1d7e28fb25..48e9ae28e5c 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,27 @@
+2008-05-23 Uros Bizjak
+	    Jakub Jelinek
+
+	PR target/36079
+	* configure.ac: Handle --enable-cld.
+	* configure: Regenerated.
+	* config.gcc: Add USE_IX86_CLD to tm_defines for x86 targets.
+	* config/i386/i386.h (struct machine_function): Add needs_cld field.
+	(ix86_current_function_needs_cld): New define.
+	* config/i386/i386.md (UNSPEC_CLD): New unspec volatile constant.
+	(cld): New isns pattern.
+	(strmov_singleop, rep_mov, strset_singleop, rep_stos, cmpstrnqi_nz_1,
+	cmpstrnqi_1, strlenqi_1): Set ix86_current_function_needs_cld flag.
+	* config/i386/i386.opt (mcld): New option.
+	* config/i386/i386.c (ix86_expand_prologue): Emit cld insn if
+	TARGET_CLD and ix86_current_function_needs_cld.
+	(override_options): Use -mcld by default for 32-bit code if
+	USE_IX86_CLD.
+
+	* doc/install.texi (Options specification): Document --enable-cld.
+	* doc/invoke.texi (Machine Dependent Options)
+	[i386 and x86-64 Options]: Add -mcld option.
+	(Intel 386 and AMD x86-64 Options): Document -mcld option.
+
 2008-05-23 Kai Tietz
 	* config/i386/i386.c (return_in_memory_32): Add ATTRIBUTE_UNUSED.
 	(return_in_memory_64): Likewise.
@@ -58,9 +82,8 @@
 	(vector_alignment_reachable_p): Likewise.
 	* tree-vect-transform.c (vectorizable_load): Likewise.
 	* tree-vectorizer.c (vect_supportable_dr_alignment): Likewise.
-
-	* tree-vectorizer.c (get_vectype_for_scalar_type): Pass mode of
-	scalar_type to UNITS_PER_SIMD_WORD.
+	(get_vectype_for_scalar_type): Pass mode of scalar_type
+	to UNITS_PER_SIMD_WORD.
 	* config/arm/arm.h (UNITS_PER_SIMD_WORD): Updated.
 	* config/i386/i386.h (UNITS_PER_SIMD_WORD): Likewise.
@@ -206,27 +229,21 @@
 2008-05-20 David Daney
 
 	* config/mips/mips.md (UNSPEC_SYNC_NEW_OP_12,
-	UNSPEC_SYNC_OLD_OP_12,
-	UNSPEC_SYNC_EXCHANGE_12): New define_constants.
-	(UNSPEC_SYNC_EXCHANGE, UNSPEC_MEMORY_BARRIER,
-	UNSPEC_SET_GOT_VERSION,
+	UNSPEC_SYNC_OLD_OP_12, UNSPEC_SYNC_EXCHANGE_12): New define_constants.
+	(UNSPEC_SYNC_EXCHANGE, UNSPEC_MEMORY_BARRIER, UNSPEC_SET_GOT_VERSION,
 	UNSPEC_UPDATE_GOT_VERSION): Renumber.
 	(optab, insn): Add 'plus' and 'minus' to define_code_attr.
 	(atomic_hiqi_op): New define_code_iterator.
-	(sync_compare_and_swap): Call
-	mips_expand_atomic_qihi instead of
+	(sync_compare_and_swap): Call mips_expand_atomic_qihi instead of
 	mips_expand_compare_and_swap_12.
 	(compare_and_swap_12): Use MIPS_COMPARE_AND_SWAP_12 instead of
-	MIPS_COMPARE_AND_SWAP_12_0. Pass argument to
-	MIPS_COMPARE_AND_SWAP_12.
+	MIPS_COMPARE_AND_SWAP_12_0. Pass argument to MIPS_COMPARE_AND_SWAP_12.
 	(sync_, sync_old_, sync_new_, sync_nand, sync_old_nand,
-	sync_new_nand): New define_expands for HI and QI mode
-	operands.
+	sync_new_nand): New define_expands for HI and QI mode operands.
 	(sync__12, sync_old__12, sync_new__12, sync_nand_12,
 	sync_old_nand_12, sync_new_nand_12): New insns.
-	(sync_lock_test_and_set): New define_expand for HI and QI
-	modes.
+	(sync_lock_test_and_set): New define_expand for HI and QI modes.
 	(test_and_set_12): New insn.
 	(sync_old_add, sync_new_add, sync_old_,
 	sync_new_, sync_old_nand,
@@ -284,10 +301,12 @@
 2008-05-20 Jan Sjodin
 	    Sebastian Pop
 
-	* tree-loop-linear.c (gather_interchange_stats): Look in the access matrix,
-	and never look at the tree representation of the memory accesses.
+	* tree-loop-linear.c (gather_interchange_stats): Look in the access
+	matrix, and never look at the tree representation of the memory
+	accesses.
 	(linear_transform_loops): Computes parameters and access matrices.
-	* tree-data-ref.c (compute_data_dependences_for_loop): Returns false when fails.
+	* tree-data-ref.c (compute_data_dependences_for_loop): Returns false
+	when fails.
 	(access_matrix_get_index_for_parameter): New.
 	* tree-data-ref.h (struct access_matrix): New.
 	(AM_LOOP_NEST_NUM, AM_NB_INDUCTION_VARS, AM_PARAMETERS, AM_MATRIX,
@@ -333,15 +352,15 @@
 
 	PR tree-optimization/36206
 	* tree-chrec.h (chrec_fold_op): New.
-	* tree-data-ref.c (initialize_matrix_A): Traverse NOP_EXPR, PLUS_EXPR, and
-	other trees.
+	* tree-data-ref.c (initialize_matrix_A): Traverse NOP_EXPR, PLUS_EXPR,
+	and other trees.
 
 2008-05-20 Nathan Sidwell
 
 	* c-incpath.c (INO_T_EQ): Do not define on non-inode systems.
 	(DIRS_EQ): New.
-	(remove_duplicates): Do not set inode on non-inode systems. Use
-	DIRS_EQ.
+	(remove_duplicates): Do not set inode on non-inode systems.
+	Use DIRS_EQ.
 
 2008-05-20 Sandra Loosemore
 
@@ -349,8 +368,7 @@
 
 2008-05-20 Richard Guenther
 
-	* tree-ssa-reassoc.c (fini_reassoc): Use the statistics
-	infrastructure.
+	* tree-ssa-reassoc.c (fini_reassoc): Use the statistics infrastructure.
 	* tree-ssa-sccvn.c (process_scc): Likewise.
 	* tree-ssa-sink.c (execute_sink_code): Likewise.
 	* tree-ssa-threadupdate.c (thread_through_all_blocks): Likewise.
diff --git a/gcc/config.gcc b/gcc/config.gcc
index fa73333ac83..efc3c4a84ef 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -397,8 +397,16 @@ then
 fi
 
 case ${target} in
+i[34567]86-*-*)
+	if test $enable_cld = yes; then
+		tm_defines="${tm_defines} USE_IX86_CLD=1"
+	fi
+	;;
 x86_64-*-*)
 	tm_file="i386/biarch64.h ${tm_file}"
+	if test $enable_cld = yes; then
+		tm_defines="${tm_defines} USE_IX86_CLD=1"
+	fi
 	;;
 esac
 
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 31a691ff38b..0f140c8adf7 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2764,6 +2764,12 @@ override_options (void)
      can be optimized to ap = __builtin_next_arg (0). */
   if (!TARGET_64BIT || TARGET_64BIT_MS_ABI)
     targetm.expand_builtin_va_start = NULL;
+
+#ifdef USE_IX86_CLD
+  /* Use -mcld by default for 32-bit code if configured with --enable-cld. */
+  if (!TARGET_64BIT)
+    target_flags |= MASK_CLD & ~target_flags_explicit;
+#endif
 }
 
 /* Return true if this goes in large data/bss. */
@@ -6597,6 +6603,10 @@ ix86_expand_prologue (void)
       emit_insn (gen_prologue_use (pic_offset_table_rtx));
       emit_insn (gen_blockage ());
     }
+
+  /* Emit cld instruction if stringops are used in the function. */
+  if (TARGET_CLD && ix86_current_function_needs_cld)
+    emit_insn (gen_cld ());
 }
 
 /* Emit code to restore saved registers using MOV insns. First register
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 72ead0795c2..0b5ca139350 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -2432,8 +2432,9 @@ struct machine_function GTY(())
   int save_varrargs_registers;
   int accesses_prev_frame;
   int optimize_mode_switching[MAX_386_ENTITIES];
-  /* Set by ix86_compute_frame_layout and used by prologue/epilogue expander to
-     determine the style used. */
+  int needs_cld;
+  /* Set by ix86_compute_frame_layout and used by prologue/epilogue
+     expander to determine the style used. */
   int use_fast_prologue_epilogue;
   /* Number of saved registers USE_FAST_PROLOGUE_EPILOGUE has been computed
      for. */
@@ -2453,6 +2454,7 @@ struct machine_function GTY(())
 #define ix86_stack_locals (cfun->machine->stack_locals)
 #define ix86_save_varrargs_registers (cfun->machine->save_varrargs_registers)
 #define ix86_optimize_mode_switching (cfun->machine->optimize_mode_switching)
+#define ix86_current_function_needs_cld (cfun->machine->needs_cld)
 #define ix86_tls_descriptor_calls_expanded_in_cfun \
   (cfun->machine->tls_descriptor_call_expanded_p)
 /* Since tls_descriptor_call_expanded is not cleared, even if all TLS
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index a021e7c75e7..8f91de08519 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -213,6 +213,7 @@
    (UNSPECV_XCHG 12)
    (UNSPECV_LOCK 13)
    (UNSPECV_PROLOGUE_USE 14)
+   (UNSPECV_CLD 15)
   ])
 
 ;; Constants to represent pcomtrue/pcomfalse variants
@@ -18374,6 +18375,14 @@
 ;; Block operation instructions
 
+(define_insn "cld"
+  [(unspec_volatile [(const_int 0)] UNSPECV_CLD)]
+  ""
+  "cld"
+  [(set_attr "length" "1")
+   (set_attr "length_immediate" "0")
+   (set_attr "modrm" "0")])
+
 (define_expand "movmemsi"
   [(use (match_operand:BLK 0 "memory_operand" ""))
    (use (match_operand:BLK 1 "memory_operand" ""))
@@ -18446,7 +18455,7 @@
               (set (match_operand 2 "register_operand" "")
                    (match_operand 5 "" ""))])]
   "TARGET_SINGLE_STRINGOP || optimize_size"
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*strmovdi_rex_1"
   [(set (mem:DI (match_operand:DI 2 "register_operand" "0"))
@@ -18563,7 +18572,7 @@
                    (match_operand 3 "memory_operand" ""))
               (use (match_dup 4))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*rep_movdi_rex64"
   [(set (match_operand:DI 2 "register_operand" "=c") (const_int 0))
@@ -18723,7 +18732,7 @@
               (set (match_operand 0 "register_operand" "")
                    (match_operand 3 "" ""))])]
   "TARGET_SINGLE_STRINGOP || optimize_size"
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*strsetdi_rex_1"
   [(set (mem:DI (match_operand:DI 1 "register_operand" "0"))
@@ -18817,7 +18826,7 @@
               (use (match_operand 3 "register_operand" ""))
               (use (match_dup 1))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*rep_stosdi_rex64"
   [(set (match_operand:DI 1 "register_operand" "=c") (const_int 0))
@@ -18993,7 +19002,7 @@
               (clobber (match_operand 1 "register_operand" ""))
               (clobber (match_dup 2))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*cmpstrnqi_nz_1"
   [(set (reg:CC FLAGS_REG)
@@ -19040,7 +19049,7 @@
               (clobber (match_operand 1 "register_operand" ""))
               (clobber (match_dup 2))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*cmpstrnqi_1"
   [(set (reg:CC FLAGS_REG)
@@ -19109,7 +19118,7 @@
               (clobber (match_operand 1 "register_operand" ""))
               (clobber (reg:CC FLAGS_REG))])]
   ""
-  "")
+  "ix86_current_function_needs_cld = 1;")
 
 (define_insn "*strlenqi_1"
   [(set (match_operand:SI 0 "register_operand" "=&c")
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 45af24acac4..75c94ba771e 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -250,6 +250,10 @@ Support SSE5 built-in functions and code generation
 
 ;; Instruction support
 
+mcld
+Target Report Mask(CLD)
+Generate cld instruction in the function prologue.
+
 mabm
 Target Report RejectNegative Var(x86_abm)
 Support code generation of Advanced Bit Manipulation (ABM) instructions.
diff --git a/gcc/configure b/gcc/configure
index 9f8cf5f19b9..b2ab9a71988 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -1046,6 +1046,7 @@ Optional Features:
   --enable-sjlj-exceptions
                           arrange to use setjmp/longjmp exception handling
   --enable-secureplt      enable -msecure-plt by default for PowerPC
+  --enable-cld            enable -mcld by default for 32bit x86
   --disable-win32-registry
                           disable lookup of installation paths in the
                           Registry on Windows hosts
@@ -13709,6 +13710,14 @@ if test "${enable_secureplt+set}" = set; then
 
 fi;
 
+# Check whether --enable-cld or --disable-cld was given.
+if test "${enable_cld+set}" = set; then
+  enableval="$enable_cld"
+
+else
+  enable_cld=no
+fi;
+
 # Windows32 Registry support for specifying GCC installation paths.
 # Check whether --enable-win32-registry or --disable-win32-registry was given.
 if test "${enable_win32_registry+set}" = set; then
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 001ff503aaf..3ac7ff53d86 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -1528,6 +1528,10 @@ AC_ARG_ENABLE(secureplt,
 [ --enable-secureplt enable -msecure-plt by default for PowerPC],
 [], [])
 
+AC_ARG_ENABLE(cld,
+[ --enable-cld enable -mcld by default for 32bit x86], [],
+[enable_cld=no])
+
 # Windows32 Registry support for specifying GCC installation paths.
 AC_ARG_ENABLE(win32-registry,
 [ --disable-win32-registry
diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 0391ec86b7d..3d098a49ba8 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -1214,6 +1214,16 @@ Using the GNU Compiler Collection (GCC)},
 See ``RS/6000 and PowerPC Options'' in the main manual
 @end ifhtml
 
+@item --enable-cld
+This option enables @option{-mcld} by default for 32-bit x86 targets.
+@ifnothtml
+@xref{i386 and x86-64 Options,, i386 and x86-64 Options, gcc,
+Using the GNU Compiler Collection (GCC)},
+@end ifnothtml
+@ifhtml
+See ``i386 and x86-64 Options'' in the main manual
+@end ifhtml
+
 @item --enable-win32-registry
 @itemx --enable-win32-registry=@var{key}
 @itemx --disable-win32-registry
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 061311fb260..4ef73c1cd1e 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -553,7 +553,7 @@ Objective-C and Objective-C++ Dialects}.
 -masm=@var{dialect} -mno-fancy-math-387 @gol
 -mno-fp-ret-in-387 -msoft-float @gol
 -mno-wide-multiply -mrtd -malign-double @gol
--mpreferred-stack-boundary=@var{num} -mcx16 -msahf -mrecip @gol
+-mpreferred-stack-boundary=@var{num} -mcld -mcx16 -msahf -mrecip @gol
 -mmmx -msse -msse2 -msse3 -mssse3 -msse4.1 -msse4.2 -msse4 @gol
 -maes -mpclmul @gol
 -msse4a -m3dnow -mpopcnt -mabm -msse5 @gol
@@ -10814,6 +10814,20 @@ supported architecture, using the appropriate flags. In particular,
 the file containing the CPU detection code should be compiled without
 these options.
 
+@item -mcld
+@opindex mcld
+This option instructs GCC to emit a @code{cld} instruction in the prologue
+of functions that use string instructions. String instructions depend on
+the DF flag to select between autoincrement or autodecrement mode. While the
+ABI specifies the DF flag to be cleared on function entry, some operating
+systems violate this specification by not clearing the DF flag in their
+exception dispatchers. The exception handler can be invoked with the DF flag
+set which leads to wrong direction mode, when string instructions are used.
+This option can be enabled by default on 32-bit x86 targets by configuring
+GCC with the @option{--enable-cld} configure option. Generation of @code{cld}
+instructions can be suppressed with the @option{-mno-cld} compiler option
+in this case.
+
 @item -mcx16
 @opindex mcx16
 This option will enable GCC to use CMPXCHG16B instruction in generated code.