+2017-08-01 Jakub Jelinek <jakub@redhat.com>
+
+ PR target/80846
+ * optabs.def (vec_extract_optab, vec_init_optab): Change from
+ a direct optab to conversion optab.
+ * optabs.c (expand_vector_broadcast): Use convert_optab_handler
+ with GET_MODE_INNER as last argument instead of optab_handler.
+ * expmed.c (extract_bit_field_1): Likewise. Use vector from
+ vector extraction if possible and optab is available.
+ * expr.c (store_constructor): Use convert_optab_handler instead
+ of optab_handler. Use vector initialization from smaller
+ vectors if possible and optab is available.
+ * tree-vect-stmts.c (vectorizable_load): Likewise.
+ * doc/md.texi (vec_extract, vec_init): Document that the optabs
+ now have two modes.
+ * config/i386/i386.c (ix86_expand_vector_init): Handle expansion
+ of vec_init from half-sized vectors with the same element mode.
+ * config/i386/sse.md (ssehalfvecmode): Add V4TI case.
+ (ssehalfvecmodelower, ssescalarmodelower): New mode attributes.
+ (reduc_plus_scal_v8df, reduc_plus_scal_v4df, reduc_plus_scal_v2df,
+ reduc_plus_scal_v16sf, reduc_plus_scal_v8sf, reduc_plus_scal_v4sf,
+ reduc_<code>_scal_<mode>, reduc_umin_scal_v8hi): Add element mode
+ after mode in gen_vec_extract* calls.
+ (vec_extract<mode>): Renamed to ...
+ (vec_extract<mode><ssescalarmodelower>): ... this.
+ (vec_extract<mode><ssehalfvecmodelower>): New expander.
+ (rotl<mode>3, rotr<mode>3, <shift_insn><mode>3, ashrv2di3): Add
+ element mode after mode in gen_vec_init* calls.
+ (VEC_INIT_HALF_MODE): New mode iterator.
+ (vec_init<mode>): Renamed to ...
+ (vec_init<mode><ssescalarmodelower>): ... this.
+ (vec_init<mode><ssehalfvecmodelower>): New expander.
+ * config/i386/mmx.md (vec_extractv2sf): Renamed to ...
+ (vec_extractv2sfsf): ... this.
+ (vec_initv2sf): Renamed to ...
+ (vec_initv2sfsf): ... this.
+ (vec_extractv2si): Renamed to ...
+ (vec_extractv2sisi): ... this.
+ (vec_initv2si): Renamed to ...
+ (vec_initv2sisi): ... this.
+ (vec_extractv4hi): Renamed to ...
+ (vec_extractv4hihi): ... this.
+ (vec_initv4hi): Renamed to ...
+ (vec_initv4hihi): ... this.
+ (vec_extractv8qi): Renamed to ...
+ (vec_extractv8qiqi): ... this.
+ (vec_initv8qi): Renamed to ...
+ (vec_initv8qiqi): ... this.
+ * config/rs6000/vector.md (VEC_base_l): New mode attribute.
+ (vec_init<mode>): Renamed to ...
+ (vec_init<mode><VEC_base_l>): ... this.
+ (vec_extract<mode>): Renamed to ...
+ (vec_extract<mode><VEC_base_l>): ... this.
+ * config/rs6000/paired.md (vec_initv2sf): Renamed to ...
+ (vec_initv2sfsf): ... this.
+ * config/rs6000/altivec.md (splitter, altivec_copysign_v4sf3,
+ vec_unpacku_hi_v16qi, vec_unpacku_hi_v8hi, vec_unpacku_lo_v16qi,
+ vec_unpacku_lo_v8hi, mulv16qi3, altivec_vreve<mode>2): Add
+ element mode after mode in gen_vec_init* calls.
+ * config/aarch64/aarch64-simd.md (vec_init<mode>): Renamed to ...
+ (vec_init<mode><Vel>): ... this.
+ (vec_extract<mode>): Renamed to ...
+ (vec_extract<mode><Vel>): ... this.
+ * config/aarch64/iterators.md (Vel): New mode attribute.
+ * config/s390/s390.c (s390_expand_vec_strlen, s390_expand_vec_movstr):
+ Add element mode after mode in gen_vec_extract* calls.
+ * config/s390/vector.md (non_vec_l): New mode attribute.
+ (vec_extract<mode>): Renamed to ...
+ (vec_extract<mode><non_vec_l>): ... this.
+ (vec_init<mode>): Renamed to ...
+ (vec_init<mode><non_vec_l>): ... this.
+ * config/s390/s390-builtins.def (s390_vlgvb, s390_vlgvh, s390_vlgvf,
+ s390_vlgvf_flt, s390_vlgvg, s390_vlgvg_dbl): Add element mode after
+ vec_extract mode.
+ * config/arm/iterators.md (V_elem_l): New mode attribute.
+ * config/arm/neon.md (vec_extract<mode>): Renamed to ...
+ (vec_extract<mode><V_elem_l>): ... this.
+ (vec_extractv2di): Renamed to ...
+ (vec_extractv2didi): ... this.
+ (vec_init<mode>): Renamed to ...
+ (vec_init<mode><V_elem_l>): ... this.
+ (reduc_plus_scal_<mode>, reduc_plus_scal_v2di, reduc_smin_scal_<mode>,
+ reduc_smax_scal_<mode>, reduc_umin_scal_<mode>,
+ reduc_umax_scal_<mode>, neon_vget_lane<mode>, neon_vget_laneu<mode>):
+ Add element mode after mode in gen_vec_extract* calls.
+ * config/mips/mips-msa.md (vec_init<mode>): Renamed to ...
+ (vec_init<mode><unitmode>): ... this.
+ (vec_extract<mode>): Renamed to ...
+ (vec_extract<mode><unitmode>): ... this.
+ * config/mips/loongson.md (vec_init<mode>): Renamed to ...
+ (vec_init<mode><unitmode>): ... this.
+ * config/mips/mips-ps-3d.md (vec_initv2sf): Renamed to ...
+ (vec_initv2sfsf): ... this.
+ (vec_extractv2sf): Renamed to ...
+ (vec_extractv2sfsf): ... this.
+ (reduc_plus_scal_v2sf, reduc_smin_scal_v2sf, reduc_smax_scal_v2sf):
+ Add element mode after mode in gen_vec_extract* calls.
+ * config/mips/mips.md (unitmode): New mode attribute.
+ * config/spu/spu.c (spu_expand_prologue, spu_allocate_stack,
+ spu_builtin_extract): Add element mode after mode in gen_vec_extract*
+ calls.
+ * config/spu/spu.md (inner_l): New mode attribute.
+ (vec_init<mode>): Renamed to ...
+ (vec_init<mode><inner_l>): ... this.
+ (vec_extract<mode>): Renamed to ...
+ (vec_extract<mode><inner_l>): ... this.
+ * config/sparc/sparc.md (veltmode): New mode attribute.
+ (vec_init<VMALL:mode>): Renamed to ...
+ (vec_init<VMALL:mode><VMALL:veltmode>): ... this.
+ * config/ia64/vect.md (vec_initv2si): Renamed to ...
+ (vec_initv2sisi): ... this.
+ (vec_initv2sf): Renamed to ...
+ (vec_initv2sfsf): ... this.
+ (vec_extractv2sf): Renamed to ...
+ (vec_extractv2sfsf): ... this.
+ * config/powerpcspe/vector.md (VEC_base_l): New mode attribute.
+ (vec_init<mode>): Renamed to ...
+ (vec_init<mode><VEC_base_l>): ... this.
+ (vec_extract<mode>): Renamed to ...
+ (vec_extract<mode><VEC_base_l>): ... this.
+ * config/powerpcspe/paired.md (vec_initv2sf): Renamed to ...
+ (vec_initv2sfsf): ... this.
+ * config/powerpcspe/altivec.md (splitter, altivec_copysign_v4sf3,
+ vec_unpacku_hi_v16qi, vec_unpacku_hi_v8hi, vec_unpacku_lo_v16qi,
+ vec_unpacku_lo_v8hi, mulv16qi3): Add element mode after mode in
+ gen_vec_init* calls.
+
2017-08-01 Richard Biener <rguenther@suse.de>
PR tree-optimization/81297
DONE;
})
-;; Standard pattern name vec_init<mode>.
+;; Standard pattern name vec_init<mode><Vel>.
-(define_expand "vec_init<mode>"
+(define_expand "vec_init<mode><Vel>"
[(match_operand:VALL_F16 0 "register_operand" "")
(match_operand 1 "" "")]
"TARGET_SIMD"
"urecpe\\t%0.<Vtype>, %1.<Vtype>"
[(set_attr "type" "neon_fp_recpe_<Vetype><q>")])
-;; Standard pattern name vec_extract<mode>.
+;; Standard pattern name vec_extract<mode><Vel>.
-(define_expand "vec_extract<mode>"
+(define_expand "vec_extract<mode><Vel>"
[(match_operand:<VEL> 0 "aarch64_simd_nonimmediate_operand" "")
(match_operand:VALL_F16 1 "register_operand" "")
(match_operand:SI 2 "immediate_operand" "")]
(SI "SI") (HI "HI")
(QI "QI")])
+;; Define element mode for each vector mode (lower case).
+(define_mode_attr Vel [(V8QI "qi") (V16QI "qi")
+ (V4HI "hi") (V8HI "hi")
+ (V2SI "si") (V4SI "si")
+ (DI "di") (V2DI "di")
+ (V4HF "hf") (V8HF "hf")
+ (V2SF "sf") (V4SF "sf")
+ (V2DF "df") (DF "df")
+ (SI "si") (HI "hi")
+ (QI "qi")])
+
;; 64-bit container modes the inner or scalar source mode.
(define_mode_attr VCOND [(HI "V4HI") (SI "V2SI")
(V4HI "V4HI") (V8HI "V4HI")
(V2SF "SF") (V4SF "SF")
(DI "DI") (V2DI "DI")])
+;; As above but in lower case.
+(define_mode_attr V_elem_l [(V8QI "qi") (V16QI "qi")
+ (V4HI "hi") (V8HI "hi")
+ (V4HF "hf") (V8HF "hf")
+ (V2SI "si") (V4SI "si")
+ (V2SF "sf") (V4SF "sf")
+ (DI "di") (V2DI "di")])
+
;; Element modes for vector extraction, padded up to register size.
(define_mode_attr V_ext [(V8QI "SI") (V16QI "SI")
DONE;
})
-(define_insn "vec_extract<mode>"
+(define_insn "vec_extract<mode><V_elem_l>"
[(set (match_operand:<V_elem> 0 "nonimmediate_operand" "=Um,r")
(vec_select:<V_elem>
(match_operand:VD_LANE 1 "s_register_operand" "w,w")
[(set_attr "type" "neon_store1_one_lane<q>,neon_to_gp<q>")]
)
-(define_insn "vec_extract<mode>"
+(define_insn "vec_extract<mode><V_elem_l>"
[(set (match_operand:<V_elem> 0 "nonimmediate_operand" "=Um,r")
(vec_select:<V_elem>
(match_operand:VQ2 1 "s_register_operand" "w,w")
[(set_attr "type" "neon_store1_one_lane<q>,neon_to_gp<q>")]
)
-(define_insn "vec_extractv2di"
+(define_insn "vec_extractv2didi"
[(set (match_operand:DI 0 "nonimmediate_operand" "=Um,r")
(vec_select:DI
(match_operand:V2DI 1 "s_register_operand" "w,w")
[(set_attr "type" "neon_store1_one_lane_q,neon_to_gp_q")]
)
-(define_expand "vec_init<mode>"
+(define_expand "vec_init<mode><V_elem_l>"
[(match_operand:VDQ 0 "s_register_operand" "")
(match_operand 1 "" "")]
"TARGET_NEON"
neon_pairwise_reduce (vec, operands[1], <MODE>mode,
&gen_neon_vpadd_internal<mode>);
/* The same result is actually computed into every element. */
- emit_insn (gen_vec_extract<mode> (operands[0], vec, const0_rtx));
+ emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], vec, const0_rtx));
DONE;
})
rtx vec = gen_reg_rtx (V2DImode);
emit_insn (gen_arm_reduc_plus_internal_v2di (vec, operands[1]));
- emit_insn (gen_vec_extractv2di (operands[0], vec, const0_rtx));
+ emit_insn (gen_vec_extractv2didi (operands[0], vec, const0_rtx));
DONE;
})
neon_pairwise_reduce (vec, operands[1], <MODE>mode,
&gen_neon_vpsmin<mode>);
/* The result is computed into every element of the vector. */
- emit_insn (gen_vec_extract<mode> (operands[0], vec, const0_rtx));
+ emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], vec, const0_rtx));
DONE;
})
neon_pairwise_reduce (vec, operands[1], <MODE>mode,
&gen_neon_vpsmax<mode>);
/* The result is computed into every element of the vector. */
- emit_insn (gen_vec_extract<mode> (operands[0], vec, const0_rtx));
+ emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], vec, const0_rtx));
DONE;
})
neon_pairwise_reduce (vec, operands[1], <MODE>mode,
&gen_neon_vpumin<mode>);
/* The result is computed into every element of the vector. */
- emit_insn (gen_vec_extract<mode> (operands[0], vec, const0_rtx));
+ emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], vec, const0_rtx));
DONE;
})
neon_pairwise_reduce (vec, operands[1], <MODE>mode,
&gen_neon_vpumax<mode>);
/* The result is computed into every element of the vector. */
- emit_insn (gen_vec_extract<mode> (operands[0], vec, const0_rtx));
+ emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], vec, const0_rtx));
DONE;
})
}
if (GET_MODE_UNIT_BITSIZE (<MODE>mode) == 32)
- emit_insn (gen_vec_extract<mode> (operands[0], operands[1], operands[2]));
+ emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], operands[1],
+ operands[2]));
else
emit_insn (gen_neon_vget_lane<mode>_sext_internal (operands[0],
operands[1],
}
if (GET_MODE_UNIT_BITSIZE (<MODE>mode) == 32)
- emit_insn (gen_vec_extract<mode> (operands[0], operands[1], operands[2]));
+ emit_insn (gen_vec_extract<mode><V_elem_l> (operands[0], operands[1],
+ operands[2]));
else
emit_insn (gen_neon_vget_lane<mode>_zext_internal (operands[0],
operands[1],
int i;
rtx x;
+ /* Handle first initialization from vector elts. */
+ if (n_elts != XVECLEN (vals, 0))
+ {
+ rtx subtarget = target;
+ x = XVECEXP (vals, 0, 0);
+ gcc_assert (GET_MODE_INNER (GET_MODE (x)) == inner_mode);
+ if (GET_MODE_NUNITS (GET_MODE (x)) * 2 == n_elts)
+ {
+ rtx ops[2] = { XVECEXP (vals, 0, 0), XVECEXP (vals, 0, 1) };
+ if (inner_mode == QImode || inner_mode == HImode)
+ {
+ mode = mode_for_vector (SImode,
+ n_elts * GET_MODE_SIZE (inner_mode) / 4);
+ inner_mode
+ = mode_for_vector (SImode,
+ n_elts * GET_MODE_SIZE (inner_mode) / 8);
+ ops[0] = gen_lowpart (inner_mode, ops[0]);
+ ops[1] = gen_lowpart (inner_mode, ops[1]);
+ subtarget = gen_reg_rtx (mode);
+ }
+ ix86_expand_vector_init_concat (mode, subtarget, ops, 2);
+ if (subtarget != target)
+ emit_move_insn (target, gen_lowpart (GET_MODE (target), subtarget));
+ return;
+ }
+ gcc_unreachable ();
+ }
+
for (i = 0; i < n_elts; ++i)
{
x = XVECEXP (vals, 0, i);
[(set (match_dup 0) (match_dup 1))]
"operands[1] = adjust_address (operands[1], SFmode, 4);")
-(define_expand "vec_extractv2sf"
+(define_expand "vec_extractv2sfsf"
[(match_operand:SF 0 "register_operand")
(match_operand:V2SF 1 "register_operand")
(match_operand 2 "const_int_operand")]
DONE;
})
-(define_expand "vec_initv2sf"
+(define_expand "vec_initv2sfsf"
[(match_operand:V2SF 0 "register_operand")
(match_operand 1)]
"TARGET_SSE"
operands[1] = adjust_address (operands[1], SImode, INTVAL (operands[2]) * 4);
})
-(define_expand "vec_extractv2si"
+(define_expand "vec_extractv2sisi"
[(match_operand:SI 0 "register_operand")
(match_operand:V2SI 1 "register_operand")
(match_operand 2 "const_int_operand")]
DONE;
})
-(define_expand "vec_initv2si"
+(define_expand "vec_initv2sisi"
[(match_operand:V2SI 0 "register_operand")
(match_operand 1)]
"TARGET_SSE"
DONE;
})
-(define_expand "vec_extractv4hi"
+(define_expand "vec_extractv4hihi"
[(match_operand:HI 0 "register_operand")
(match_operand:V4HI 1 "register_operand")
(match_operand 2 "const_int_operand")]
DONE;
})
-(define_expand "vec_initv4hi"
+(define_expand "vec_initv4hihi"
[(match_operand:V4HI 0 "register_operand")
(match_operand 1)]
"TARGET_SSE"
DONE;
})
-(define_expand "vec_extractv8qi"
+(define_expand "vec_extractv8qiqi"
[(match_operand:QI 0 "register_operand")
(match_operand:V8QI 1 "register_operand")
(match_operand 2 "const_int_operand")]
DONE;
})
-(define_expand "vec_initv8qi"
+(define_expand "vec_initv8qiqi"
[(match_operand:V8QI 0 "register_operand")
(match_operand 1)]
"TARGET_SSE"
;; Mapping of vector modes to a vector mode of half size
(define_mode_attr ssehalfvecmode
- [(V64QI "V32QI") (V32HI "V16HI") (V16SI "V8SI") (V8DI "V4DI")
+ [(V64QI "V32QI") (V32HI "V16HI") (V16SI "V8SI") (V8DI "V4DI") (V4TI "V2TI")
(V32QI "V16QI") (V16HI "V8HI") (V8SI "V4SI") (V4DI "V2DI")
(V16QI "V8QI") (V8HI "V4HI") (V4SI "V2SI")
(V16SF "V8SF") (V8DF "V4DF")
(V8SF "V4SF") (V4DF "V2DF")
(V4SF "V2SF")])
+(define_mode_attr ssehalfvecmodelower
+ [(V64QI "v32qi") (V32HI "v16hi") (V16SI "v8si") (V8DI "v4di") (V4TI "v2ti")
+ (V32QI "v16qi") (V16HI "v8hi") (V8SI "v4si") (V4DI "v2di")
+ (V16QI "v8qi") (V8HI "v4hi") (V4SI "v2si")
+ (V16SF "v8sf") (V8DF "v4df")
+ (V8SF "v4sf") (V4DF "v2df")
+ (V4SF "v2sf")])
+
;; Mapping of vector modes ti packed single mode of the same size
(define_mode_attr ssePSmode
[(V16SI "V16SF") (V8DF "V16SF")
(V8DF "DF") (V4DF "DF") (V2DF "DF")
(V4TI "TI") (V2TI "TI")])
+;; Mapping of vector modes back to the scalar modes
+(define_mode_attr ssescalarmodelower
+ [(V64QI "qi") (V32QI "qi") (V16QI "qi")
+ (V32HI "hi") (V16HI "hi") (V8HI "hi")
+ (V16SI "si") (V8SI "si") (V4SI "si")
+ (V8DI "di") (V4DI "di") (V2DI "di")
+ (V16SF "sf") (V8SF "sf") (V4SF "sf")
+ (V8DF "df") (V4DF "df") (V2DF "df")
+ (V4TI "ti") (V2TI "ti")])
+
;; Mapping of vector modes to the 128bit modes
(define_mode_attr ssexmmmode
[(V64QI "V16QI") (V32QI "V16QI") (V16QI "V16QI")
{
rtx tmp = gen_reg_rtx (V8DFmode);
ix86_expand_reduc (gen_addv8df3, tmp, operands[1]);
- emit_insn (gen_vec_extractv8df (operands[0], tmp, const0_rtx));
+ emit_insn (gen_vec_extractv8dfdf (operands[0], tmp, const0_rtx));
DONE;
})
emit_insn (gen_avx_haddv4df3 (tmp, operands[1], operands[1]));
emit_insn (gen_avx_vperm2f128v4df3 (tmp2, tmp, tmp, GEN_INT (1)));
emit_insn (gen_addv4df3 (vec_res, tmp, tmp2));
- emit_insn (gen_vec_extractv4df (operands[0], vec_res, const0_rtx));
+ emit_insn (gen_vec_extractv4dfdf (operands[0], vec_res, const0_rtx));
DONE;
})
{
rtx tmp = gen_reg_rtx (V2DFmode);
emit_insn (gen_sse3_haddv2df3 (tmp, operands[1], operands[1]));
- emit_insn (gen_vec_extractv2df (operands[0], tmp, const0_rtx));
+ emit_insn (gen_vec_extractv2dfdf (operands[0], tmp, const0_rtx));
DONE;
})
{
rtx tmp = gen_reg_rtx (V16SFmode);
ix86_expand_reduc (gen_addv16sf3, tmp, operands[1]);
- emit_insn (gen_vec_extractv16sf (operands[0], tmp, const0_rtx));
+ emit_insn (gen_vec_extractv16sfsf (operands[0], tmp, const0_rtx));
DONE;
})
emit_insn (gen_avx_haddv8sf3 (tmp2, tmp, tmp));
emit_insn (gen_avx_vperm2f128v8sf3 (tmp, tmp2, tmp2, GEN_INT (1)));
emit_insn (gen_addv8sf3 (vec_res, tmp, tmp2));
- emit_insn (gen_vec_extractv8sf (operands[0], vec_res, const0_rtx));
+ emit_insn (gen_vec_extractv8sfsf (operands[0], vec_res, const0_rtx));
DONE;
})
}
else
ix86_expand_reduc (gen_addv4sf3, vec_res, operands[1]);
- emit_insn (gen_vec_extractv4sf (operands[0], vec_res, const0_rtx));
+ emit_insn (gen_vec_extractv4sfsf (operands[0], vec_res, const0_rtx));
DONE;
})
{
rtx tmp = gen_reg_rtx (<MODE>mode);
ix86_expand_reduc (gen_<code><mode>3, tmp, operands[1]);
- emit_insn (gen_vec_extract<mode> (operands[0], tmp, const0_rtx));
+ emit_insn (gen_vec_extract<mode><ssescalarmodelower> (operands[0], tmp,
+ const0_rtx));
DONE;
})
{
rtx tmp = gen_reg_rtx (<MODE>mode);
ix86_expand_reduc (gen_<code><mode>3, tmp, operands[1]);
- emit_insn (gen_vec_extract<mode> (operands[0], tmp, const0_rtx));
+ emit_insn (gen_vec_extract<mode><ssescalarmodelower> (operands[0], tmp,
+ const0_rtx));
DONE;
})
{
rtx tmp = gen_reg_rtx (<MODE>mode);
ix86_expand_reduc (gen_<code><mode>3, tmp, operands[1]);
- emit_insn (gen_vec_extract<mode> (operands[0], tmp, const0_rtx));
+ emit_insn (gen_vec_extract<mode><ssescalarmodelower> (operands[0], tmp,
+ const0_rtx));
DONE;
})
{
rtx tmp = gen_reg_rtx (V8HImode);
ix86_expand_reduc (gen_uminv8hi3, tmp, operands[1]);
- emit_insn (gen_vec_extractv8hi (operands[0], tmp, const0_rtx));
+ emit_insn (gen_vec_extractv8hihi (operands[0], tmp, const0_rtx));
DONE;
})
(V8DF "TARGET_AVX512F") (V4DF "TARGET_AVX") V2DF
(V4TI "TARGET_AVX512F") (V2TI "TARGET_AVX")])
-(define_expand "vec_extract<mode>"
+(define_expand "vec_extract<mode><ssescalarmodelower>"
[(match_operand:<ssescalarmode> 0 "register_operand")
(match_operand:VEC_EXTRACT_MODE 1 "register_operand")
(match_operand 2 "const_int_operand")]
DONE;
})
+(define_expand "vec_extract<mode><ssehalfvecmodelower>"
+ [(match_operand:<ssehalfvecmode> 0 "nonimmediate_operand")
+ (match_operand:V_512 1 "register_operand")
+ (match_operand 2 "const_0_to_1_operand")]
+ "TARGET_AVX512F"
+{
+ if (INTVAL (operands[2]))
+ emit_insn (gen_vec_extract_hi_<mode> (operands[0], operands[1]));
+ else
+ emit_insn (gen_vec_extract_lo_<mode> (operands[0], operands[1]));
+ DONE;
+})
+
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
;;
;; Parallel double-precision floating point element swizzling
for (i = 0; i < <ssescalarnum>; i++)
RTVEC_ELT (vs, i) = op2;
- emit_insn (gen_vec_init<mode> (reg, par));
+ emit_insn (gen_vec_init<mode><ssescalarmodelower> (reg, par));
emit_insn (gen_xop_vrotl<mode>3 (operands[0], operands[1], reg));
DONE;
}
for (i = 0; i < <ssescalarnum>; i++)
RTVEC_ELT (vs, i) = op2;
- emit_insn (gen_vec_init<mode> (reg, par));
+ emit_insn (gen_vec_init<mode><ssescalarmodelower> (reg, par));
emit_insn (gen_neg<mode>2 (neg, reg));
emit_insn (gen_xop_vrotl<mode>3 (operands[0], operands[1], neg));
DONE;
XVECEXP (par, 0, i) = operands[2];
tmp = gen_reg_rtx (V16QImode);
- emit_insn (gen_vec_initv16qi (tmp, par));
+ emit_insn (gen_vec_initv16qiqi (tmp, par));
if (negate)
emit_insn (gen_negv16qi2 (tmp, tmp));
for (i = 0; i < 2; i++)
XVECEXP (par, 0, i) = operands[2];
- emit_insn (gen_vec_initv2di (reg, par));
+ emit_insn (gen_vec_initv2didi (reg, par));
if (negate)
emit_insn (gen_negv2di2 (reg, reg));
<ssehalfvecmode>mode);
})
-;; Modes handled by vec_init patterns.
+;; Modes handled by vec_init expanders.
(define_mode_iterator VEC_INIT_MODE
[(V64QI "TARGET_AVX512F") (V32QI "TARGET_AVX") V16QI
(V32HI "TARGET_AVX512F") (V16HI "TARGET_AVX") V8HI
(V8DF "TARGET_AVX512F") (V4DF "TARGET_AVX") (V2DF "TARGET_SSE2")
(V4TI "TARGET_AVX512F") (V2TI "TARGET_AVX")])
-(define_expand "vec_init<mode>"
+;; Likewise, but for initialization from half sized vectors.
+;; Thus, these are all VEC_INIT_MODE modes except V2??.
+(define_mode_iterator VEC_INIT_HALF_MODE
+ [(V64QI "TARGET_AVX512F") (V32QI "TARGET_AVX") V16QI
+ (V32HI "TARGET_AVX512F") (V16HI "TARGET_AVX") V8HI
+ (V16SI "TARGET_AVX512F") (V8SI "TARGET_AVX") V4SI
+ (V8DI "TARGET_AVX512F") (V4DI "TARGET_AVX")
+ (V16SF "TARGET_AVX512F") (V8SF "TARGET_AVX") V4SF
+ (V8DF "TARGET_AVX512F") (V4DF "TARGET_AVX")
+ (V4TI "TARGET_AVX512F")])
+
+(define_expand "vec_init<mode><ssescalarmodelower>"
[(match_operand:VEC_INIT_MODE 0 "register_operand")
(match_operand 1)]
"TARGET_SSE"
DONE;
})
+(define_expand "vec_init<mode><ssehalfvecmodelower>"
+ [(match_operand:VEC_INIT_HALF_MODE 0 "register_operand")
+ (match_operand 1)]
+ "TARGET_SSE"
+{
+ ix86_expand_vector_init (false, operands[0], operands[1]);
+ DONE;
+})
+
(define_insn "<avx2_avx512>_ashrv<mode><mask_name>"
[(set (match_operand:VI48_AVX512F_AVX512VL 0 "register_operand" "=v")
(ashiftrt:VI48_AVX512F_AVX512VL
}
[(set_attr "itanium_class" "mmshf")])
-(define_expand "vec_initv2si"
+(define_expand "vec_initv2sisi"
[(match_operand:V2SI 0 "gr_register_operand" "")
(match_operand 1 "" "")]
""
"fselect %0 = %F2, %F3, %1"
[(set_attr "itanium_class" "fmisc")])
-(define_expand "vec_initv2sf"
+(define_expand "vec_initv2sfsf"
[(match_operand:V2SF 0 "fr_register_operand" "")
(match_operand 1 "" "")]
""
operands[1] = gen_rtx_REG (SFmode, REGNO (operands[1]));
})
-(define_expand "vec_extractv2sf"
+(define_expand "vec_extractv2sfsf"
[(set (match_operand:SF 0 "register_operand" "")
(unspec:SF [(match_operand:V2SF 1 "register_operand" "")
(match_operand:DI 2 "const_int_operand" "")]
;; Initialization of a vector.
-(define_expand "vec_init<mode>"
+(define_expand "vec_init<mode><unitmode>"
[(set (match_operand:VWHB 0 "register_operand")
(match_operand 1 ""))]
"TARGET_HARD_FLOAT && TARGET_LOONGSON_VECTORS"
(V4SI "uimm5")
(V2DI "uimm6")])
-(define_expand "vec_init<mode>"
+(define_expand "vec_init<mode><unitmode>"
[(match_operand:MSA 0 "register_operand")
(match_operand:MSA 1 "")]
"ISA_HAS_MSA"
DONE;
})
-(define_expand "vec_extract<mode>"
+(define_expand "vec_extract<mode><unitmode>"
[(match_operand:<UNITMODE> 0 "register_operand")
(match_operand:IMSA 1 "register_operand")
(match_operand 2 "const_<indeximm>_operand")]
DONE;
})
-(define_expand "vec_extract<mode>"
+(define_expand "vec_extract<mode><unitmode>"
[(match_operand:<UNITMODE> 0 "register_operand")
(match_operand:FMSA 1 "register_operand")
(match_operand 2 "const_<indeximm>_operand")]
})
; vec_init
-(define_expand "vec_initv2sf"
+(define_expand "vec_initv2sfsf"
[(match_operand:V2SF 0 "register_operand")
(match_operand:V2SF 1 "")]
"TARGET_HARD_FLOAT && TARGET_PAIRED_SINGLE_FLOAT"
;; emulated. There is no other way to get a vector mode bitfield extract
;; currently.
-(define_insn "vec_extractv2sf"
+(define_insn "vec_extractv2sfsf"
[(set (match_operand:SF 0 "register_operand" "=f")
(vec_select:SF (match_operand:V2SF 1 "register_operand" "f")
(parallel
rtx temp = gen_reg_rtx (V2SFmode);
emit_insn (gen_mips_addr_ps (temp, operands[1], operands[1]));
rtx lane = BYTES_BIG_ENDIAN ? const1_rtx : const0_rtx;
- emit_insn (gen_vec_extractv2sf (operands[0], temp, lane));
+ emit_insn (gen_vec_extractv2sfsf (operands[0], temp, lane));
DONE;
})
rtx temp = gen_reg_rtx (V2SFmode);
mips_expand_vec_reduc (temp, operands[1], gen_sminv2sf3);
rtx lane = BYTES_BIG_ENDIAN ? const1_rtx : const0_rtx;
- emit_insn (gen_vec_extractv2sf (operands[0], temp, lane));
+ emit_insn (gen_vec_extractv2sfsf (operands[0], temp, lane));
DONE;
})
rtx temp = gen_reg_rtx (V2SFmode);
mips_expand_vec_reduc (temp, operands[1], gen_smaxv2sf3);
rtx lane = BYTES_BIG_ENDIAN ? const1_rtx : const0_rtx;
- emit_insn (gen_vec_extractv2sf (operands[0], temp, lane));
+ emit_insn (gen_vec_extractv2sfsf (operands[0], temp, lane));
DONE;
})
(V16QI "QI") (V8HI "HI") (V4SI "SI") (V2DI "DI")
(V2DF "DF")])
+;; As above, but in lower case.
+(define_mode_attr unitmode [(SF "sf") (DF "df") (V2SF "sf") (V4SF "sf")
+ (V16QI "qi") (V8QI "qi") (V8HI "hi") (V4HI "hi")
+ (V4SI "si") (V2SI "si") (V2DI "di") (V2DF "df")])
+
;; This attribute gives the integer mode that has the same size as a
;; fixed-point mode.
(define_mode_attr IMODE [(QQ "QI") (HQ "HI") (SQ "SI") (DQ "DI")
for (i = 0; i < num_elements; i++)
RTVEC_ELT (v, i) = constm1_rtx;
- emit_insn (gen_vec_initv4si (dest, gen_rtx_PARALLEL (mode, v)));
+ emit_insn (gen_vec_initv4sisi (dest, gen_rtx_PARALLEL (mode, v)));
emit_insn (gen_rtx_SET (dest, gen_rtx_ASHIFT (mode, dest, dest)));
DONE;
})
RTVEC_ELT (v, 2) = GEN_INT (mask_val);
RTVEC_ELT (v, 3) = GEN_INT (mask_val);
- emit_insn (gen_vec_initv4si (mask, gen_rtx_PARALLEL (V4SImode, v)));
+ emit_insn (gen_vec_initv4sisi (mask, gen_rtx_PARALLEL (V4SImode, v)));
emit_insn (gen_vector_select_v4sf (operands[0], operands[1], operands[2],
gen_lowpart (V4SFmode, mask)));
DONE;
RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 16 : 0);
RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 7 : 16);
- emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
+ emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
emit_insn (gen_vperm_v16qiv8hi (operands[0], operands[1], vzero, mask));
DONE;
}")
RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 6 : 17);
RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 7 : 16);
- emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
+ emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
emit_insn (gen_vperm_v8hiv4si (operands[0], operands[1], vzero, mask));
DONE;
}")
RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 16 : 8);
RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 15 : 16);
- emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
+ emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
emit_insn (gen_vperm_v16qiv8hi (operands[0], operands[1], vzero, mask));
DONE;
}")
RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 14 : 17);
RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 15 : 16);
- emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
+ emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
emit_insn (gen_vperm_v8hiv4si (operands[0], operands[1], vzero, mask));
DONE;
}")
= gen_rtx_CONST_INT (QImode, BYTES_BIG_ENDIAN ? 2 * i + 17 : 15 - 2 * i);
}
- emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
+ emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
emit_insn (gen_altivec_vmulesb (even, operands[1], operands[2]));
emit_insn (gen_altivec_vmulosb (odd, operands[1], operands[2]));
emit_insn (gen_altivec_vperm_v8hiv16qi (operands[0], even, odd, mask));
"ps_muls1 %0, %1, %2"
[(set_attr "type" "fp")])
-(define_expand "vec_initv2sf"
+(define_expand "vec_initv2sfsf"
[(match_operand:V2SF 0 "gpc_reg_operand" "=f")
(match_operand 1 "" "")]
"TARGET_PAIRED_FLOAT"
(V1TI "TI")
(TI "TI")])
+;; As above, but in lower case
+(define_mode_attr VEC_base_l [(V16QI "qi")
+ (V8HI "hi")
+ (V4SI "si")
+ (V2DI "di")
+ (V4SF "sf")
+ (V2DF "df")
+ (V1TI "ti")
+ (TI "ti")])
+
;; Same size integer type for floating point data
(define_mode_attr VEC_int [(V4SF "v4si")
(V2DF "v2di")])
\f
;; Vector initialization, set, extract
-(define_expand "vec_init<mode>"
+(define_expand "vec_init<mode><VEC_base_l>"
[(match_operand:VEC_E 0 "vlogical_operand" "")
(match_operand:VEC_E 1 "" "")]
"VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
DONE;
})
-(define_expand "vec_extract<mode>"
+(define_expand "vec_extract<mode><VEC_base_l>"
[(match_operand:<VEC_base> 0 "register_operand" "")
(match_operand:VEC_E 1 "vlogical_operand" "")
(match_operand 2 "const_int_operand" "")]
for (i = 0; i < num_elements; i++)
RTVEC_ELT (v, i) = constm1_rtx;
- emit_insn (gen_vec_initv4si (dest, gen_rtx_PARALLEL (mode, v)));
+ emit_insn (gen_vec_initv4sisi (dest, gen_rtx_PARALLEL (mode, v)));
emit_insn (gen_rtx_SET (dest, gen_rtx_ASHIFT (mode, dest, dest)));
DONE;
})
RTVEC_ELT (v, 2) = GEN_INT (mask_val);
RTVEC_ELT (v, 3) = GEN_INT (mask_val);
- emit_insn (gen_vec_initv4si (mask, gen_rtx_PARALLEL (V4SImode, v)));
+ emit_insn (gen_vec_initv4sisi (mask, gen_rtx_PARALLEL (V4SImode, v)));
emit_insn (gen_vector_select_v4sf (operands[0], operands[1], operands[2],
gen_lowpart (V4SFmode, mask)));
DONE;
RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 16 : 0);
RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 7 : 16);
- emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
+ emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
emit_insn (gen_vperm_v16qiv8hi (operands[0], operands[1], vzero, mask));
DONE;
}")
RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 6 : 17);
RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 7 : 16);
- emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
+ emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
emit_insn (gen_vperm_v8hiv4si (operands[0], operands[1], vzero, mask));
DONE;
}")
RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 16 : 8);
RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 15 : 16);
- emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
+ emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
emit_insn (gen_vperm_v16qiv8hi (operands[0], operands[1], vzero, mask));
DONE;
}")
RTVEC_ELT (v, 14) = gen_rtx_CONST_INT (QImode, be ? 14 : 17);
RTVEC_ELT (v, 15) = gen_rtx_CONST_INT (QImode, be ? 15 : 16);
- emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
+ emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
emit_insn (gen_vperm_v8hiv4si (operands[0], operands[1], vzero, mask));
DONE;
}")
= gen_rtx_CONST_INT (QImode, BYTES_BIG_ENDIAN ? 2 * i + 17 : 15 - 2 * i);
}
- emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
+ emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
emit_insn (gen_altivec_vmulesb (even, operands[1], operands[2]));
emit_insn (gen_altivec_vmulosb (odd, operands[1], operands[2]));
emit_insn (gen_altivec_vperm_v8hiv16qi (operands[0], even, odd, mask));
RTVEC_ELT (v, i + j * size)
= GEN_INT (i + (num_elements - 1 - j) * size);
- emit_insn (gen_vec_initv16qi (mask, gen_rtx_PARALLEL (V16QImode, v)));
+ emit_insn (gen_vec_initv16qiqi (mask, gen_rtx_PARALLEL (V16QImode, v)));
emit_insn (gen_altivec_vperm_<mode> (operands[0], operands[1],
operands[1], mask));
DONE;
"ps_muls1 %0, %1, %2"
[(set_attr "type" "fp")])
-(define_expand "vec_initv2sf"
+(define_expand "vec_initv2sfsf"
[(match_operand:V2SF 0 "gpc_reg_operand" "=f")
(match_operand 1 "" "")]
"TARGET_PAIRED_FLOAT"
(V1TI "TI")
(TI "TI")])
+;; As above, but in lower case
+(define_mode_attr VEC_base_l [(V16QI "qi")
+ (V8HI "hi")
+ (V4SI "si")
+ (V2DI "di")
+ (V4SF "sf")
+ (V2DF "df")
+ (V1TI "ti")
+ (TI "ti")])
+
;; Same size integer type for floating point data
(define_mode_attr VEC_int [(V4SF "v4si")
(V2DF "v2di")])
\f
;; Vector initialization, set, extract
-(define_expand "vec_init<mode>"
+(define_expand "vec_init<mode><VEC_base_l>"
[(match_operand:VEC_E 0 "vlogical_operand" "")
(match_operand:VEC_E 1 "" "")]
"VECTOR_MEM_ALTIVEC_OR_VSX_P (<MODE>mode)"
DONE;
})
-(define_expand "vec_extract<mode>"
+(define_expand "vec_extract<mode><VEC_base_l>"
[(match_operand:<VEC_base> 0 "register_operand" "")
(match_operand:VEC_E 1 "vlogical_operand" "")
(match_operand 2 "const_int_operand" "")]
OB_DEF_VAR (s390_vec_extract_b64, s390_vlgvg, 0, O2_ELEM, BT_OV_ULONGLONG_BV2DI_INT)
OB_DEF_VAR (s390_vec_extract_dbl, s390_vlgvg_dbl, 0, O2_ELEM, BT_OV_DBL_V2DF_INT) /* vlgvg */
-B_DEF (s390_vlgvb, vec_extractv16qi, 0, B_VX, O2_ELEM, BT_FN_UCHAR_UV16QI_INT)
-B_DEF (s390_vlgvh, vec_extractv8hi, 0, B_VX, O2_ELEM, BT_FN_USHORT_UV8HI_INT)
-B_DEF (s390_vlgvf, vec_extractv4si, 0, B_VX, O2_ELEM, BT_FN_UINT_UV4SI_INT)
-B_DEF (s390_vlgvf_flt, vec_extractv4sf, 0, B_INT | B_VXE, O2_ELEM, BT_FN_FLT_V4SF_INT)
-B_DEF (s390_vlgvg, vec_extractv2di, 0, B_VX, O2_ELEM, BT_FN_ULONGLONG_UV2DI_INT)
-B_DEF (s390_vlgvg_dbl, vec_extractv2df, 0, B_INT | B_VX, O2_ELEM, BT_FN_DBL_V2DF_INT)
+B_DEF (s390_vlgvb, vec_extractv16qiqi, 0, B_VX, O2_ELEM, BT_FN_UCHAR_UV16QI_INT)
+B_DEF (s390_vlgvh, vec_extractv8hihi, 0, B_VX, O2_ELEM, BT_FN_USHORT_UV8HI_INT)
+B_DEF (s390_vlgvf, vec_extractv4sisi, 0, B_VX, O2_ELEM, BT_FN_UINT_UV4SI_INT)
+B_DEF (s390_vlgvf_flt, vec_extractv4sfsf, 0, B_INT | B_VXE, O2_ELEM, BT_FN_FLT_V4SF_INT)
+B_DEF (s390_vlgvg, vec_extractv2didi, 0, B_VX, O2_ELEM, BT_FN_ULONGLONG_UV2DI_INT)
+B_DEF (s390_vlgvg_dbl, vec_extractv2dfdf, 0, B_INT | B_VX, O2_ELEM, BT_FN_DBL_V2DF_INT)
OB_DEF (s390_vec_insert_and_zero, s390_vec_insert_and_zero_s8,s390_vec_insert_and_zero_dbl,B_VX,BT_FN_OV4SI_INTCONSTPTR)
OB_DEF_VAR (s390_vec_insert_and_zero_s8,s390_vllezb, 0, 0, BT_OV_V16QI_SCHARCONSTPTR)
add_int_reg_note (s390_emit_ccraw_jump (8, NE, loop_start_label),
REG_BR_PROB,
profile_probability::very_likely ().to_reg_br_prob_note ());
- emit_insn (gen_vec_extractv16qi (len, result_reg, GEN_INT (7)));
+ emit_insn (gen_vec_extractv16qiqi (len, result_reg, GEN_INT (7)));
/* If the string pointer wasn't aligned we have loaded less then 16
bytes and the remaining bytes got filled with zeros (by vll).
emit_insn (gen_vlbb (vsrc, src, GEN_INT (6)));
emit_insn (gen_lcbb (loadlen, src_addr, GEN_INT (6)));
emit_insn (gen_vfenezv16qi (vpos, vsrc, vsrc));
- emit_insn (gen_vec_extractv16qi (gpos_qi, vpos, GEN_INT (7)));
+ emit_insn (gen_vec_extractv16qiqi (gpos_qi, vpos, GEN_INT (7)));
emit_move_insn (gpos, gen_rtx_SUBREG (SImode, gpos_qi, 0));
/* gpos is the byte index if a zero was found and 16 otherwise.
So if it is lower than the loaded bytes we have a hit. */
force_expand_binop (Pmode, add_optab, dst_addr_reg, offset, dst_addr_reg,
1, OPTAB_DIRECT);
- emit_insn (gen_vec_extractv16qi (gpos_qi, vpos, GEN_INT (7)));
+ emit_insn (gen_vec_extractv16qiqi (gpos_qi, vpos, GEN_INT (7)));
emit_move_insn (gpos, gen_rtx_SUBREG (SImode, gpos_qi, 0));
emit_insn (gen_vstlv16qi (vsrc, gpos, gen_rtx_MEM (BLKmode, dst_addr_reg)));
(V1DF "DF") (V2DF "DF")
(V1TF "TF") (TF "TF")])
+; Like above, but in lower case.
+(define_mode_attr non_vec_l[(V1QI "qi") (V2QI "qi") (V4QI "qi") (V8QI "qi")
+ (V16QI "qi")
+ (V1HI "hi") (V2HI "hi") (V4HI "hi") (V8HI "hi")
+ (V1SI "si") (V2SI "si") (V4SI "si")
+ (V1DI "di") (V2DI "di")
+ (V1TI "ti") (TI "ti")
+ (V1SF "sf") (V2SF "sf") (V4SF "sf")
+ (V1DF "df") (V2DF "df")
+ (V1TF "tf") (TF "tf")])
+
; The instruction suffix for integer instructions and instructions
; which do not care about whether it is floating point or integer.
(define_mode_attr bhfgq[(V1QI "b") (V2QI "b") (V4QI "b") (V8QI "b") (V16QI "b")
; FIXME: Support also vector mode operands for 0
; FIXME: This should be (vec_select ..) or something but it does only allow constant selectors :(
; This is used via RTL standard name as well as for expanding the builtin
-(define_expand "vec_extract<mode>"
+(define_expand "vec_extract<mode><non_vec_l>"
[(set (match_operand:<non_vec> 0 "nonimmediate_operand" "")
(unspec:<non_vec> [(match_operand:V 1 "register_operand" "")
(match_operand:SI 2 "nonmemory_operand" "")]
"vlgv<bhfgq>\t%0,%v1,%Y3(%2)"
[(set_attr "op_type" "VRS")])
-(define_expand "vec_init<mode>"
+(define_expand "vec_init<mode><non_vec_l>"
[(match_operand:V_128 0 "register_operand" "")
(match_operand:V_128 1 "nonmemory_operand" "")]
"TARGET_VX"
(define_mode_attr vfptype [(V1SI "single") (V2HI "single") (V4QI "single")
(V1DI "double") (V2SI "double") (V4HI "double")
(V8QI "double")])
+(define_mode_attr veltmode [(V1SI "si") (V2HI "hi") (V4QI "qi") (V1DI "di")
+ (V2SI "si") (V4HI "hi") (V8QI "qi")])
(define_expand "mov<VMALL:mode>"
[(set (match_operand:VMALL 0 "nonimmediate_operand" "")
DONE;
})
-(define_expand "vec_init<VMALL:mode>"
+(define_expand "vec_init<VMALL:mode><VMALL:veltmode>"
[(match_operand:VMALL 0 "register_operand" "")
(match_operand:VMALL 1 "" "")]
"TARGET_VIS"
size_v4si = scratch_v4si;
}
emit_insn (gen_cgt_v4si (scratch_v4si, sp_v4si, size_v4si));
- emit_insn (gen_vec_extractv4si
+ emit_insn (gen_vec_extractv4sisi
(scratch_reg_0, scratch_v4si, GEN_INT (1)));
emit_insn (gen_spu_heq (scratch_reg_0, GEN_INT (0)));
}
{
rtx avail = gen_reg_rtx(SImode);
rtx result = gen_reg_rtx(SImode);
- emit_insn (gen_vec_extractv4si (avail, sp, GEN_INT (1)));
+ emit_insn (gen_vec_extractv4sisi (avail, sp, GEN_INT (1)));
emit_insn (gen_cgt_si(result, avail, GEN_INT (-1)));
emit_insn (gen_spu_heq (result, GEN_INT(0) ));
}
switch (mode)
{
case V16QImode:
- emit_insn (gen_vec_extractv16qi (ops[0], ops[1], ops[2]));
+ emit_insn (gen_vec_extractv16qiqi (ops[0], ops[1], ops[2]));
break;
case V8HImode:
- emit_insn (gen_vec_extractv8hi (ops[0], ops[1], ops[2]));
+ emit_insn (gen_vec_extractv8hihi (ops[0], ops[1], ops[2]));
break;
case V4SFmode:
- emit_insn (gen_vec_extractv4sf (ops[0], ops[1], ops[2]));
+ emit_insn (gen_vec_extractv4sfsf (ops[0], ops[1], ops[2]));
break;
case V4SImode:
- emit_insn (gen_vec_extractv4si (ops[0], ops[1], ops[2]));
+ emit_insn (gen_vec_extractv4sisi (ops[0], ops[1], ops[2]));
break;
case V2DImode:
- emit_insn (gen_vec_extractv2di (ops[0], ops[1], ops[2]));
+ emit_insn (gen_vec_extractv2didi (ops[0], ops[1], ops[2]));
break;
case V2DFmode:
- emit_insn (gen_vec_extractv2df (ops[0], ops[1], ops[2]));
+ emit_insn (gen_vec_extractv2dfdf (ops[0], ops[1], ops[2]));
break;
default:
abort ();
(V2DI "DI")
(V4SF "SF")
(V2DF "DF")])
+;; Like above, but in lower case
+(define_mode_attr inner_l [(V16QI "qi")
+ (V8HI "hi")
+ (V4SI "si")
+ (V2DI "di")
+ (V4SF "sf")
+ (V2DF "df")])
(define_mode_attr vmult [(V16QI "1")
(V8HI "2")
(V4SI "4")
;; vector patterns
;; Vector initialization
-(define_expand "vec_init<mode>"
+(define_expand "vec_init<mode><inner_l>"
[(match_operand:V 0 "register_operand" "")
(match_operand 1 "" "")]
""
operands[6] = GEN_INT (size);
})
-(define_expand "vec_extract<mode>"
+(define_expand "vec_extract<mode><inner_l>"
[(set (match_operand:<inner> 0 "spu_reg_operand" "=r")
(vec_select:<inner> (match_operand:V 1 "spu_reg_operand" "r")
(parallel [(match_operand 2 "const_int_operand" "i")])))]
Set given field in the vector value. Operand 0 is the vector to modify,
operand 1 is new value of field and operand 2 specify the field index.
-@cindex @code{vec_extract@var{m}} instruction pattern
-@item @samp{vec_extract@var{m}}
+@cindex @code{vec_extract@var{m}@var{n}} instruction pattern
+@item @samp{vec_extract@var{m}@var{n}}
Extract given field from the vector value. Operand 1 is the vector, operand 2
-specify field index and operand 0 place to store value into.
-
-@cindex @code{vec_init@var{m}} instruction pattern
-@item @samp{vec_init@var{m}}
+specifies the field index and operand 0 is the place to store the value.
+The @var{n} mode is the mode of the field or vector of fields to be
+extracted, and must be either the element mode of the vector mode @var{m}
+or a vector mode with the same element mode and a smaller number of
+elements. If @var{n} is a vector mode, the index is counted in units of
+that mode.
+
+@cindex @code{vec_init@var{m}@var{n}} instruction pattern
+@item @samp{vec_init@var{m}@var{n}}
Initialize the vector to given values. Operand 0 is the vector to initialize
-and operand 1 is parallel containing values for individual fields.
+and operand 1 is a parallel containing the values for the individual fields.
+The @var{n} mode is the mode of the elements, and must be either the
+element mode of the vector mode @var{m} or a vector mode with the same
+element mode and a smaller number of elements.
@cindex @code{vec_cmp@var{m}@var{n}} instruction pattern
@item @samp{vec_cmp@var{m}@var{n}}
return op0;
}
+ /* First try to check for vector from vector extractions. */
+ if (VECTOR_MODE_P (GET_MODE (op0))
+ && !MEM_P (op0)
+ && VECTOR_MODE_P (tmode)
+ && GET_MODE_SIZE (GET_MODE (op0)) > GET_MODE_SIZE (tmode))
+ {
+ machine_mode new_mode = GET_MODE (op0);
+ if (GET_MODE_INNER (new_mode) != GET_MODE_INNER (tmode))
+ {
+ new_mode = mode_for_vector (GET_MODE_INNER (tmode),
+ GET_MODE_BITSIZE (GET_MODE (op0))
+ / GET_MODE_UNIT_BITSIZE (tmode));
+ if (!VECTOR_MODE_P (new_mode)
+ || GET_MODE_SIZE (new_mode) != GET_MODE_SIZE (GET_MODE (op0))
+ || GET_MODE_INNER (new_mode) != GET_MODE_INNER (tmode)
+ || !targetm.vector_mode_supported_p (new_mode))
+ new_mode = VOIDmode;
+ }
+ if (new_mode != VOIDmode
+ && (convert_optab_handler (vec_extract_optab, new_mode, tmode)
+ != CODE_FOR_nothing)
+ && ((bitnum + bitsize - 1) / GET_MODE_BITSIZE (tmode)
+ == bitnum / GET_MODE_BITSIZE (tmode)))
+ {
+ struct expand_operand ops[3];
+ machine_mode outermode = new_mode;
+ machine_mode innermode = tmode;
+ enum insn_code icode
+ = convert_optab_handler (vec_extract_optab, outermode, innermode);
+ unsigned HOST_WIDE_INT pos = bitnum / GET_MODE_BITSIZE (innermode);
+
+ if (new_mode != GET_MODE (op0))
+ op0 = gen_lowpart (new_mode, op0);
+ create_output_operand (&ops[0], target, innermode);
+ ops[0].target = 1;
+ create_input_operand (&ops[1], op0, outermode);
+ create_integer_operand (&ops[2], pos);
+ if (maybe_expand_insn (icode, 3, ops))
+ {
+ if (alt_rtl && ops[0].target)
+ *alt_rtl = target;
+ target = ops[0].value;
+ if (GET_MODE (target) != mode)
+ return gen_lowpart (tmode, target);
+ return target;
+ }
+ }
+ }
+
/* See if we can get a better vector mode before extracting. */
if (VECTOR_MODE_P (GET_MODE (op0))
&& !MEM_P (op0)
available. */
if (VECTOR_MODE_P (GET_MODE (op0))
&& !MEM_P (op0)
- && optab_handler (vec_extract_optab, GET_MODE (op0)) != CODE_FOR_nothing
+ && (convert_optab_handler (vec_extract_optab, GET_MODE (op0),
+ GET_MODE_INNER (GET_MODE (op0)))
+ != CODE_FOR_nothing)
&& ((bitnum + bitsize - 1) / GET_MODE_UNIT_BITSIZE (GET_MODE (op0))
== bitnum / GET_MODE_UNIT_BITSIZE (GET_MODE (op0))))
{
struct expand_operand ops[3];
machine_mode outermode = GET_MODE (op0);
machine_mode innermode = GET_MODE_INNER (outermode);
- enum insn_code icode = optab_handler (vec_extract_optab, outermode);
+ enum insn_code icode
+ = convert_optab_handler (vec_extract_optab, outermode, innermode);
unsigned HOST_WIDE_INT pos = bitnum / GET_MODE_BITSIZE (innermode);
create_output_operand (&ops[0], target, innermode);
rtvec vector = NULL;
unsigned n_elts;
alias_set_type alias;
+ bool vec_vec_init_p = false;
gcc_assert (eltmode != BLKmode);
if (REG_P (target) && VECTOR_MODE_P (GET_MODE (target)))
{
machine_mode mode = GET_MODE (target);
+ machine_mode emode = eltmode;
- icode = (int) optab_handler (vec_init_optab, mode);
- /* Don't use vec_init<mode> if some elements have VECTOR_TYPE. */
- if (icode != CODE_FOR_nothing)
+ if (CONSTRUCTOR_NELTS (exp)
+ && (TREE_CODE (TREE_TYPE (CONSTRUCTOR_ELT (exp, 0)->value))
+ == VECTOR_TYPE))
{
- tree value;
-
- FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (exp), idx, value)
- if (TREE_CODE (TREE_TYPE (value)) == VECTOR_TYPE)
- {
- icode = CODE_FOR_nothing;
- break;
- }
+ tree etype = TREE_TYPE (CONSTRUCTOR_ELT (exp, 0)->value);
+ gcc_assert (CONSTRUCTOR_NELTS (exp) * TYPE_VECTOR_SUBPARTS (etype)
+ == n_elts);
+ emode = TYPE_MODE (etype);
}
+ icode = (int) convert_optab_handler (vec_init_optab, mode, emode);
if (icode != CODE_FOR_nothing)
{
- unsigned int i;
+ unsigned int i, n = n_elts;
- vector = rtvec_alloc (n_elts);
- for (i = 0; i < n_elts; i++)
- RTVEC_ELT (vector, i) = CONST0_RTX (GET_MODE_INNER (mode));
+ if (emode != eltmode)
+ {
+ n = CONSTRUCTOR_NELTS (exp);
+ vec_vec_init_p = true;
+ }
+ vector = rtvec_alloc (n);
+ for (i = 0; i < n; i++)
+ RTVEC_ELT (vector, i) = CONST0_RTX (emode);
}
}
FOR_EACH_CONSTRUCTOR_VALUE (CONSTRUCTOR_ELTS (exp), idx, value)
{
- int n_elts_here = tree_to_uhwi
- (int_const_binop (TRUNC_DIV_EXPR,
- TYPE_SIZE (TREE_TYPE (value)),
- TYPE_SIZE (elttype)));
+ tree sz = TYPE_SIZE (TREE_TYPE (value));
+ int n_elts_here
+ = tree_to_uhwi (int_const_binop (TRUNC_DIV_EXPR, sz,
+ TYPE_SIZE (elttype)));
count += n_elts_here;
if (mostly_zeros_p (value))
if (vector)
{
- /* vec_init<mode> should not be used if there are VECTOR_TYPE
- elements. */
- gcc_assert (TREE_CODE (TREE_TYPE (value)) != VECTOR_TYPE);
- RTVEC_ELT (vector, eltpos)
- = expand_normal (value);
+ if (vec_vec_init_p)
+ {
+ gcc_assert (ce->index == NULL_TREE);
+ gcc_assert (TREE_CODE (TREE_TYPE (value)) == VECTOR_TYPE);
+ eltpos = idx;
+ }
+ else
+ gcc_assert (TREE_CODE (TREE_TYPE (value)) != VECTOR_TYPE);
+ RTVEC_ELT (vector, eltpos) = expand_normal (value);
}
else
{
- machine_mode value_mode =
- TREE_CODE (TREE_TYPE (value)) == VECTOR_TYPE
- ? TYPE_MODE (TREE_TYPE (value))
- : eltmode;
+ machine_mode value_mode
+ = (TREE_CODE (TREE_TYPE (value)) == VECTOR_TYPE
+ ? TYPE_MODE (TREE_TYPE (value)) : eltmode);
bitpos = eltpos * elt_size;
store_constructor_field (target, bitsize, bitpos, 0,
bitregion_end, value_mode,
}
if (vector)
- emit_insn (GEN_FCN (icode)
- (target,
- gen_rtx_PARALLEL (GET_MODE (target), vector)));
+ emit_insn (GEN_FCN (icode) (target,
+ gen_rtx_PARALLEL (GET_MODE (target),
+ vector)));
break;
}
/* ??? If the target doesn't have a vec_init, then we have no easy way
of performing this operation. Most of this sort of generic support
is hidden away in the vector lowering support in gimple. */
- icode = optab_handler (vec_init_optab, vmode);
+ icode = convert_optab_handler (vec_init_optab, vmode,
+ GET_MODE_INNER (vmode));
if (icode == CODE_FOR_nothing)
return NULL;
OPTAB_CD(vec_cmpeq_optab, "vec_cmpeq$a$b")
OPTAB_CD(maskload_optab, "maskload$a$b")
OPTAB_CD(maskstore_optab, "maskstore$a$b")
+OPTAB_CD(vec_extract_optab, "vec_extract$a$b")
+OPTAB_CD(vec_init_optab, "vec_init$a$b")
OPTAB_NL(add_optab, "add$P$a3", PLUS, "add", '3', gen_int_fp_fixed_libfunc)
OPTAB_NX(add_optab, "add$F$a3")
OPTAB_D (usum_widen_optab, "widen_usum$I$a3")
OPTAB_D (usad_optab, "usad$I$a")
OPTAB_D (ssad_optab, "ssad$I$a")
-OPTAB_D (vec_extract_optab, "vec_extract$a")
-OPTAB_D (vec_init_optab, "vec_init$a")
OPTAB_D (vec_pack_sfix_trunc_optab, "vec_pack_sfix_trunc_$a")
OPTAB_D (vec_pack_ssat_optab, "vec_pack_ssat_$a")
OPTAB_D (vec_pack_trunc_optab, "vec_pack_trunc_$a")
{
if (group_size < nunits)
{
- /* Avoid emitting a constructor of vector elements by performing
- the loads using an integer type of the same size,
- constructing a vector of those and then re-interpreting it
- as the original vector type. This works around the fact
- that the vec_init optab was only designed for scalar
- element modes and thus expansion goes through memory.
- This avoids a huge runtime penalty due to the general
- inability to perform store forwarding from smaller stores
- to a larger load. */
- unsigned lsize
- = group_size * TYPE_PRECISION (TREE_TYPE (vectype));
- machine_mode elmode = mode_for_size (lsize, MODE_INT, 0);
- machine_mode vmode = mode_for_vector (elmode,
- nunits / group_size);
- /* If we can't construct such a vector fall back to
- element loads of the original vector type. */
+ /* First check if vec_init optab supports construction from
+ vector elts directly. */
+ machine_mode elmode = TYPE_MODE (TREE_TYPE (vectype));
+ machine_mode vmode = mode_for_vector (elmode, group_size);
if (VECTOR_MODE_P (vmode)
- && optab_handler (vec_init_optab, vmode) != CODE_FOR_nothing)
+ && (convert_optab_handler (vec_init_optab,
+ TYPE_MODE (vectype), vmode)
+ != CODE_FOR_nothing))
{
nloads = nunits / group_size;
lnel = group_size;
- ltype = build_nonstandard_integer_type (lsize, 1);
- lvectype = build_vector_type (ltype, nloads);
+ ltype = build_vector_type (TREE_TYPE (vectype), group_size);
+ }
+ else
+ {
+ /* Otherwise avoid emitting a constructor of vector elements
+ by performing the loads using an integer type of the same
+ size, constructing a vector of those and then
+ re-interpreting it as the original vector type.
+ This avoids a huge runtime penalty due to the general
+ inability to perform store forwarding from smaller stores
+ to a larger load. */
+ unsigned lsize
+ = group_size * TYPE_PRECISION (TREE_TYPE (vectype));
+ elmode = mode_for_size (lsize, MODE_INT, 0);
+ vmode = mode_for_vector (elmode, nunits / group_size);
+ /* If we can't construct such a vector fall back to
+ element loads of the original vector type. */
+ if (VECTOR_MODE_P (vmode)
+ && (convert_optab_handler (vec_init_optab, vmode, elmode)
+ != CODE_FOR_nothing))
+ {
+ nloads = nunits / group_size;
+ lnel = group_size;
+ ltype = build_nonstandard_integer_type (lsize, 1);
+ lvectype = build_vector_type (ltype, nloads);
+ }
}
}
else