vsx.md (VSINT_84): Add DImode to enable loading DImode constants with XXSPLTIB in vector registers.

[gcc]
2016-06-15  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/vsx.md (VSINT_84): Add DImode to enable loading
	DImode constants with XXSPLTIB in vector registers.
	(vsx_extract_<mode>, V2DImode/V2DFmode): Combine both
	vsx_extract_<mode>_internal{1,2} into a single insn that handles
	direct move (both ISA 2.07 and ISA 3.0 versions), and optimizes
	extraction of the element at the top of the register as a scalar
	value.
	(vsx_extract_<mode>_internal1): Likewise.
	(vsx_extract_<mode>_internal2): Likewise.
	* config/rs6000/constraints.md (wi constraint): Remove a comment
	about DImode not being allowed in Altivec registers.
	(wB constraint): New constraint for constants that can be
	generated in Altivec registers with VSPLTISW/VUPKHSW.
	* config/rs6000/predicates.md (xxspltib_constant_split): Update
	comments.
	(xxspltib_constant_nosplit): Likewise.
	* config/rs6000/rs6000-cpus.def (ISA_2_6_MASKS_SERVER): Add
	support for -mupper-regs-di to enable DImode to go into Altivec
	registers.
	(POWERPC_MASKS): Likewise.
	(power7 cpu): Likewise.
	* config/rs6000/rs6000.opt (-mupper-regs-di): Likewise.
	* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Add support
	for DImode being allowed in Altivec registers.  Update wi/wj
	constraints.  Set scalar_in_vmx_p flag.
	(rs6000_option_override_internal): Add checks for -mupper-regs-di.
	(xxspltib_constant_p): Allow CONST_INT's with VOIDmode.  Don't
	return true if we could use VSPLTISW/VUPKHSW instead of XXSPLTIB.
	(rs6000_opt_masks): Add -mupper-regs-di.
	* config/rs6000/rs6000.md (lfiwax): Update clobbers that don't use
	direct move to use wi and not wj.
	(lfiwzx): Likewise.
	(floatsi<mode>2_lfiwax_mem): Combine alternatives into a single
	alternative.
	(floatunssi<mode>2_lfiwzx_mem): Likewise.
	(fix_trunc<mode>di2_fctidz): Change second alternative to allow
	any VSX register, instead of just Altivec registers, to allow
	either operand to be an Altivec register or both.
	(fixuns_trunc<mode>di2_fctiduz): Likewise.
	(movdi_internal32): Add support for -mupper-regs-di.  Add support
	to load constants via XXSPLTIB or VSPLTISW.  Add spacing to allow
	the alternatives and attributes to be lined up to be easier to
	read.
	(movdi_internal64): Likewise.
	(64-bit DImode splitters): Change predicates to only split loading
	up GPR registers.  Add splits for using XXSPLTIB or VSPLTISW to
	load constants in ISA 3.0 or ISA 2.07 respectively.
	* doc/invoke.texi (RS/6000 and PowerPC Options): Document
	-mupper-regs-di.  Update -mupper-regs-df and -mupper-regs-sf to
	mention -mcpu=power9 sets these options.
	* doc/md.texi (PowerPC and IBM RS6000 constraints): Document the
	wB constraint.

[gcc/testsuite]
2016-06-15  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/p9-dimode1.c: New test.
	* gcc.target/powerpc/p9-dimode2.c: Likewise.

From-SVN: r237490
Michael Meissner 2016-06-15 18:17:58 +00:00 committed by Michael Meissner
parent 61daecc46b
commit 1a3c3ee9bc
13 changed files with 397 additions and 100 deletions

gcc/ChangeLog

@ -1,3 +1,58 @@
2016-06-15 Michael Meissner <meissner@linux.vnet.ibm.com>
* config/rs6000/vsx.md (VSINT_84): Add DImode to enable loading
DImode constants with XXSPLTIB in vector registers.
(vsx_extract_<mode>, V2DImode/V2DFmode): Combine both
vsx_extract_<mode>_internal{1,2} into a single insn that handles
direct move (both ISA 2.07 and ISA 3.0 versions), and optimizes
extraction of the element at the top of the register as a scalar
value.
(vsx_extract_<mode>_internal1): Likewise.
(vsx_extract_<mode>_internal2): Likewise.
* config/rs6000/constraints.md (wi constraint): Remove a comment
about DImode not being allowed in Altivec registers.
(wB constraint): New constraint for constants that can be
generated in Altivec registers with VSPLTISW/VUPKHSW.
* config/rs6000/predicates.md (xxspltib_constant_split): Update
comments.
(xxspltib_constant_nosplit): Likewise.
* config/rs6000/rs6000-cpus.def (ISA_2_6_MASKS_SERVER): Add
support for -mupper-regs-di to enable DImode to go into Altivec
registers.
(POWERPC_MASKS): Likewise.
(power7 cpu): Likewise.
* config/rs6000/rs6000.opt (-mupper-regs-di): Likewise.
* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Add support
for DImode being allowed in Altivec registers. Update wi/wj
constraints. Set scalar_in_vmx_p flag.
(rs6000_option_override_internal): Add checks for -mupper-regs-di.
(xxspltib_constant_p): Allow CONST_INT's with VOIDmode. Don't
return true if we could use VSPLTISW/VUPKHSW instead of XXSPLTIB.
(rs6000_opt_masks): Add -mupper-regs-di.
* config/rs6000/rs6000.md (lfiwax): Update clobbers that don't use
direct move to use wi and not wj.
(lfiwzx): Likewise.
(floatsi<mode>2_lfiwax_mem): Combine alternatives into a single
alternative.
(floatunssi<mode>2_lfiwzx_mem): Likewise.
(fix_trunc<mode>di2_fctidz): Change second alternative to allow
any VSX register, instead of just Altivec registers, to allow
either operand to be an Altivec register or both.
(fixuns_trunc<mode>di2_fctiduz): Likewise.
(movdi_internal32): Add support for -mupper-regs-di. Add support
to load constants via XXSPLTIB or VSPLTISW. Add spacing to allow
the alternatives and attributes to be lined up to be easier to
read.
(movdi_internal64): Likewise.
(64-bit DImode splitters): Change predicates to only split loading
up GPR registers. Add splits for using XXSPLTIB or VSPLTISW to
load constants in ISA 3.0 or ISA 2.07 respectively.
* doc/invoke.texi (RS/6000 and PowerPC Options): Document
-mupper-regs-di. Update -mupper-regs-df and -mupper-regs-sf to
mention -mcpu=power9 sets these options.
* doc/md.texi (PowerPC and IBM RS6000 constraints): Document the
wB constraint.
2016-06-15 Pitchumani Sivanupandi <pitchumani.s@atmel.com>
PR target/67353

gcc/config/rs6000/constraints.md

@ -77,8 +77,6 @@
(define_register_constraint "wh" "rs6000_constraints[RS6000_CONSTRAINT_wh]"
"Floating point register if direct moves are available, or NO_REGS.")
;; At present, DImode is not allowed in the Altivec registers. If in the
;; future it is allowed, wi/wj can be set to VSX_REGS instead of FLOAT_REGS.
(define_register_constraint "wi" "rs6000_constraints[RS6000_CONSTRAINT_wi]"
"FP or VSX register to hold 64-bit integers for VSX insns or NO_REGS.")
@ -135,6 +133,13 @@
(define_register_constraint "wz" "rs6000_constraints[RS6000_CONSTRAINT_wz]"
"Floating point register if the LFIWZX instruction is enabled or NO_REGS.")
;; wB needs ISA 2.07 VUPKHSW
(define_constraint "wB"
"Signed 5-bit constant integer that can be loaded into an altivec register."
(and (match_code "const_int")
(and (match_test "TARGET_P8_VECTOR")
(match_operand 0 "s5bit_cint_operand"))))
(define_constraint "wD"
"Int constant that is the element number of the 64-bit scalar in a vector."
(and (match_code "const_int")

gcc/config/rs6000/predicates.md

@ -565,9 +565,8 @@
}
})
;; Return 1 if the operand is a CONST_VECTOR or VEC_DUPLICATE of a constant
;; that can loaded with a XXSPLTIB instruction and then a VUPKHSB, VECSB2W or
;; VECSB2D instruction.
;; Return 1 if the operand is a constant that can be loaded with a XXSPLTIB
;; instruction and then a VUPKHSB, VECSB2W or VECSB2D instruction.
(define_predicate "xxspltib_constant_split"
(match_code "const_vector,vec_duplicate,const_int")
@ -582,8 +581,8 @@
})
;; Return 1 if the operand is a CONST_VECTOR that can loaded directly with a
;; XXSPLTIB instruction.
;; Return 1 if the operand is a constant that can be loaded directly with a
;; XXSPLTIB instruction.
(define_predicate "xxspltib_constant_nosplit"
(match_code "const_vector,vec_duplicate,const_int")

gcc/config/rs6000/rs6000-cpus.def

@ -45,6 +45,7 @@
| OPTION_MASK_POPCNTD \
| OPTION_MASK_ALTIVEC \
| OPTION_MASK_VSX \
| OPTION_MASK_UPPER_REGS_DI \
| OPTION_MASK_UPPER_REGS_DF)
/* For now, don't provide an embedded version of ISA 2.07. */
@ -119,6 +120,7 @@
| OPTION_MASK_SOFT_FLOAT \
| OPTION_MASK_STRICT_ALIGN_OPTIONAL \
| OPTION_MASK_TOC_FUSION \
| OPTION_MASK_UPPER_REGS_DI \
| OPTION_MASK_UPPER_REGS_DF \
| OPTION_MASK_UPPER_REGS_SF \
| OPTION_MASK_VSX \
@ -211,7 +213,8 @@ RS6000_CPU ("power6x", PROCESSOR_POWER6, MASK_POWERPC64 | MASK_PPC_GPOPT
RS6000_CPU ("power7", PROCESSOR_POWER7, /* Don't add MASK_ISEL by default */
POWERPC_7400_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_MFCRF
| MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_POPCNTD
| MASK_VSX | MASK_RECIP_PRECISION | OPTION_MASK_UPPER_REGS_DF)
| MASK_VSX | MASK_RECIP_PRECISION | OPTION_MASK_UPPER_REGS_DF
| OPTION_MASK_UPPER_REGS_DI)
RS6000_CPU ("power8", PROCESSOR_POWER8, MASK_POWERPC64 | ISA_2_7_MASKS_SERVER)
RS6000_CPU ("power9", PROCESSOR_POWER9, MASK_POWERPC64 | ISA_3_0_MASKS_SERVER)
RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0)

gcc/config/rs6000/rs6000.c

@ -1938,7 +1938,8 @@ rs6000_hard_regno_mode_ok (int regno, machine_mode mode)
|| FLOAT128_VECTOR_P (mode)
|| reg_addr[mode].scalar_in_vmx_p
|| (TARGET_VSX_TIMODE && mode == TImode)
|| (TARGET_VADDUQM && mode == V1TImode)))
|| (TARGET_VADDUQM && mode == V1TImode)
|| (TARGET_UPPER_REGS_DI && mode == DImode)))
{
if (FP_REGNO_P (regno))
return FP_REGNO_P (last_regno);
@ -3082,7 +3083,6 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
rs6000_constraints[RS6000_CONSTRAINT_wa] = VSX_REGS;
rs6000_constraints[RS6000_CONSTRAINT_wd] = VSX_REGS; /* V2DFmode */
rs6000_constraints[RS6000_CONSTRAINT_wf] = VSX_REGS; /* V4SFmode */
rs6000_constraints[RS6000_CONSTRAINT_wi] = FLOAT_REGS; /* DImode */
if (TARGET_VSX_TIMODE)
rs6000_constraints[RS6000_CONSTRAINT_wt] = VSX_REGS; /* TImode */
@ -3094,6 +3094,11 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
}
else
rs6000_constraints[RS6000_CONSTRAINT_ws] = FLOAT_REGS;
if (TARGET_UPPER_REGS_DI) /* DImode */
rs6000_constraints[RS6000_CONSTRAINT_wi] = VSX_REGS;
else
rs6000_constraints[RS6000_CONSTRAINT_wi] = FLOAT_REGS;
}
/* Add conditional constraints based on various options, to allow us to
@ -3306,6 +3311,9 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
if (TARGET_UPPER_REGS_DF)
reg_addr[DFmode].scalar_in_vmx_p = true;
if (TARGET_UPPER_REGS_DI)
reg_addr[DImode].scalar_in_vmx_p = true;
if (TARGET_UPPER_REGS_SF)
reg_addr[SFmode].scalar_in_vmx_p = true;
}
@ -4085,9 +4093,9 @@ rs6000_option_override_internal (bool global_init_p)
rs6000_isa_flags &= ~OPTION_MASK_DFP;
}
/* Allow an explicit -mupper-regs to set both -mupper-regs-df and
-mupper-regs-sf, depending on the cpu, unless the user explicitly also set
the individual option. */
/* Allow an explicit -mupper-regs to set -mupper-regs-df, -mupper-regs-di,
and -mupper-regs-sf, depending on the cpu, unless the user explicitly also
set the individual option. */
if (TARGET_UPPER_REGS > 0)
{
if (TARGET_VSX
@ -4096,6 +4104,12 @@ rs6000_option_override_internal (bool global_init_p)
rs6000_isa_flags |= OPTION_MASK_UPPER_REGS_DF;
rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DF;
}
if (TARGET_VSX
&& !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DI))
{
rs6000_isa_flags |= OPTION_MASK_UPPER_REGS_DI;
rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DI;
}
if (TARGET_P8_VECTOR
&& !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF))
{
@ -4111,6 +4125,12 @@ rs6000_option_override_internal (bool global_init_p)
rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DF;
rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DF;
}
if (TARGET_VSX
&& !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DI))
{
rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DI;
rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DI;
}
if (TARGET_P8_VECTOR
&& !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF))
{
@ -4126,6 +4146,13 @@ rs6000_option_override_internal (bool global_init_p)
rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DF;
}
if (TARGET_UPPER_REGS_DI && !TARGET_VSX)
{
if (rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DI)
error ("-mupper-regs-di requires -mvsx");
rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DI;
}
if (TARGET_UPPER_REGS_SF && !TARGET_P8_VECTOR)
{
if (rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF)
@ -4386,6 +4413,7 @@ rs6000_option_override_internal (bool global_init_p)
if (TARGET_FLOAT128_HW
&& (rs6000_isa_flags & (OPTION_MASK_P9_VECTOR
| OPTION_MASK_DIRECT_MOVE
| OPTION_MASK_UPPER_REGS_DI
| OPTION_MASK_UPPER_REGS_DF
| OPTION_MASK_UPPER_REGS_SF)) == 0)
{
@ -6284,7 +6312,7 @@ xxspltib_constant_p (rtx op,
if (mode == VOIDmode)
mode = GET_MODE (op);
else if (mode != GET_MODE (op))
else if (mode != GET_MODE (op) && GET_MODE (op) != VOIDmode)
return false;
/* Handle (vec_duplicate <constant>). */
@ -6337,8 +6365,8 @@ xxspltib_constant_p (rtx op,
}
/* Handle integer constants being loaded into the upper part of the VSX
register as a scalar. If the value isn't 0/-1, only allow it if
the mode can go in Altivec registers. */
register as a scalar. If the value isn't 0/-1, only allow it if the mode
can go in Altivec registers. Prefer VSPLTISW/VUPKHSW over XXSPLTIB. */
else if (CONST_INT_P (op))
{
if (!SCALAR_INT_MODE_P (mode))
@ -6348,9 +6376,14 @@ xxspltib_constant_p (rtx op,
if (!IN_RANGE (value, -128, 127))
return false;
if (!IN_RANGE (value, -1, 0)
&& (reg_addr[mode].addr_mask[RELOAD_REG_VMX] & RELOAD_REG_VALID) == 0)
return false;
if (!IN_RANGE (value, -1, 0))
{
if (!(reg_addr[mode].addr_mask[RELOAD_REG_VMX] & RELOAD_REG_VALID))
return false;
if (EASY_VECTOR_15 (value))
return false;
}
}
else
@ -35485,6 +35518,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
{ "string", OPTION_MASK_STRING, false, true },
{ "toc-fusion", OPTION_MASK_TOC_FUSION, false, true },
{ "update", OPTION_MASK_NO_UPDATE, true , true },
{ "upper-regs-di", OPTION_MASK_UPPER_REGS_DI, false, true },
{ "upper-regs-df", OPTION_MASK_UPPER_REGS_DF, false, true },
{ "upper-regs-sf", OPTION_MASK_UPPER_REGS_SF, false, true },
{ "vsx", OPTION_MASK_VSX, false, true },
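
The scalar CONST_INT checks added to xxspltib_constant_p can be modelled in isolation. This is a minimal sketch, not the GCC code: fits_vspltisw and scalar_passes_xxspltib_checks are hypothetical names, fits_vspltisw assumes EASY_VECTOR_15 accepts -16..15 (the VSPLTISW immediate range), and the altivec_ok flag stands in for the reg_addr[mode].addr_mask[RELOAD_REG_VMX] test. The remainder of the function lies outside the hunk and is not modelled, so a true result only means "not rejected by these checks".

```c
#include <stdbool.h>

/* Assumed equivalent of EASY_VECTOR_15: the signed 5-bit range of VSPLTISW.  */
static bool
fits_vspltisw (long value)
{
  return value >= -16 && value <= 15;
}

/* Hypothetical model of the scalar checks in the hunk above.  Returns false
   where the patched code returns false, true where it falls through.  */
static bool
scalar_passes_xxspltib_checks (long value, bool altivec_ok)
{
  if (value < -128 || value > 127)
    return false;		/* Does not fit the 8-bit XXSPLTIB immediate.  */

  if (value != 0 && value != -1)
    {
      if (!altivec_ok)
	return false;		/* Mode cannot live in an Altivec register.  */

      if (fits_vspltisw (value))
	return false;		/* Prefer VSPLTISW/VUPKHSW.  */
    }

  return true;
}
```

Under this model a constant such as 1 is rejected (it is left to VSPLTISW/VUPKHSW), while 100 passes only when the mode may live in Altivec registers.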

gcc/config/rs6000/rs6000.md

@ -4866,7 +4866,7 @@
(define_insn_and_split "floatsi<mode>2_lfiwax"
[(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Fv>")
(float:SFDF (match_operand:SI 1 "nonimmediate_operand" "r")))
(clobber (match_scratch:DI 2 "=wj"))]
(clobber (match_scratch:DI 2 "=wi"))]
"TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWAX
&& <SI_CONVERT_FP> && can_create_pseudo_p ()"
"#"
@ -4905,11 +4905,11 @@
(set_attr "type" "fpload")])
(define_insn_and_split "floatsi<mode>2_lfiwax_mem"
[(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fa>")
[(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Fv>")
(float:SFDF
(sign_extend:DI
(match_operand:SI 1 "indexed_or_indirect_operand" "Z,Z"))))
(clobber (match_scratch:DI 2 "=0,d"))]
(match_operand:SI 1 "indexed_or_indirect_operand" "Z"))))
(clobber (match_scratch:DI 2 "=wi"))]
"TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWAX
&& <SI_CONVERT_FP>"
"#"
@ -4941,7 +4941,7 @@
(define_insn_and_split "floatunssi<mode>2_lfiwzx"
[(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Fv>")
(unsigned_float:SFDF (match_operand:SI 1 "nonimmediate_operand" "r")))
(clobber (match_scratch:DI 2 "=wj"))]
(clobber (match_scratch:DI 2 "=wi"))]
"TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWZX
&& <SI_CONVERT_FP>"
"#"
@ -4980,11 +4980,11 @@
(set_attr "type" "fpload")])
(define_insn_and_split "floatunssi<mode>2_lfiwzx_mem"
[(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fa>")
[(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Fv>")
(unsigned_float:SFDF
(zero_extend:DI
(match_operand:SI 1 "indexed_or_indirect_operand" "Z,Z"))))
(clobber (match_scratch:DI 2 "=0,d"))]
(match_operand:SI 1 "indexed_or_indirect_operand" "Z"))))
(clobber (match_scratch:DI 2 "=wi"))]
"TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWZX
&& <SI_CONVERT_FP>"
"#"
@ -5288,7 +5288,7 @@
(define_insn "*fix_trunc<mode>di2_fctidz"
[(set (match_operand:DI 0 "gpc_reg_operand" "=d,wi")
(fix:DI (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fa>")))]
(fix:DI (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")))]
"TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS
&& TARGET_FCFID"
"@
@ -5360,7 +5360,7 @@
(define_insn "*fixuns_trunc<mode>di2_fctiduz"
[(set (match_operand:DI 0 "gpc_reg_operand" "=d,wi")
(unsigned_fix:DI (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fa>")))]
(unsigned_fix:DI (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")))]
"TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS
&& TARGET_FCTIDUZ"
"@
@ -7700,9 +7700,25 @@
;; non-offsettable address by using r->r which won't make progress.
;; Use of fprs is disparaged slightly otherwise reload prefers to reload
;; a gpr into a fpr instead of reloading an invalid 'Y' address
;; GPR store GPR load GPR move FPR store FPR load FPR move
;; GPR const AVX store AVX store AVX load AVX load VSX move
;; P9 0 P9 -1 AVX 0/-1 VSX 0 VSX -1 P9 const
;; AVX const
(define_insn "*movdi_internal32"
[(set (match_operand:DI 0 "rs6000_nonimmediate_operand" "=Y,r,r,?m,?*d,?*d,r")
(match_operand:DI 1 "input_operand" "r,Y,r,d,m,d,IJKnGHF"))]
[(set (match_operand:DI 0 "rs6000_nonimmediate_operand"
"=Y, r, r, ?m, ?*d, ?*d,
r, ?Y, ?Z, ?*wb, ?*wv, ?wi,
?wo, ?wo, ?wv, ?wi, ?wi, ?wv,
?wv")
(match_operand:DI 1 "input_operand"
"r, Y, r, d, m, d,
IJKnGHF, wb, wv, Y, Z, wi,
Oj, wM, OjwM, Oj, wM, wS,
wB"))]
"! TARGET_POWERPC64
&& (gpc_reg_operand (operands[0], DImode)
|| gpc_reg_operand (operands[1], DImode))"
@ -7713,8 +7729,24 @@
stfd%U0%X0 %1,%0
lfd%U1%X1 %0,%1
fmr %0,%1
#
stxsd %1,%0
stxsdx %x1,%y0
lxsd %0,%1
lxsdx %x0,%y1
xxlor %x0,%x1,%x1
xxspltib %x0,0
xxspltib %x0,255
vspltisw %0,%1
xxlxor %x0,%x0,%x0
xxlorc %x0,%x0,%x0
#
#"
[(set_attr "type" "store,load,*,fpstore,fpload,fp,*")])
[(set_attr "type"
"store, load, *, fpstore, fpload, fp,
*, fpstore, fpstore, fpload, fpload, vecsimple,
vecsimple, vecsimple, vecsimple, vecsimple, vecsimple, vecsimple,
vecsimple")])
(define_split
[(set (match_operand:DI 0 "gpc_reg_operand" "")
@ -7744,9 +7776,26 @@
[(pc)]
{ rs6000_split_multireg_move (operands[0], operands[1]); DONE; })
;; GPR store GPR load GPR move GPR li GPR lis GPR #
;; FPR store FPR load FPR move AVX store AVX store AVX load
;; AVX load VSX move P9 0 P9 -1 AVX 0/-1 VSX 0
;; VSX -1 P9 const AVX const From SPR To SPR SPR<->SPR
;; FPR->GPR GPR->FPR VSX->GPR GPR->VSX
(define_insn "*movdi_internal64"
[(set (match_operand:DI 0 "nonimmediate_operand" "=Y,r,r,r,r,r,?m,?*d,?*d,r,*h,*h,r,?*wg,r,?*wj,?*wi")
(match_operand:DI 1 "input_operand" "r,Y,r,I,L,nF,d,m,d,*h,r,0,*wg,r,*wj,r,O"))]
[(set (match_operand:DI 0 "nonimmediate_operand"
"=Y, r, r, r, r, r,
?m, ?*d, ?*d, ?Y, ?Z, ?*wb,
?*wv, ?wi, ?wo, ?wo, ?wv, ?wi,
?wi, ?wv, ?wv, r, *h, *h,
?*r, ?*wg, ?*r, ?*wj")
(match_operand:DI 1 "input_operand"
"r, Y, r, I, L, nF,
d, m, d, wb, wv, Y,
Z, wi, Oj, wM, OjwM, Oj,
wM, wS, wB, *h, r, 0,
wg, r, wj, r"))]
"TARGET_POWERPC64
&& (gpc_reg_operand (operands[0], DImode)
|| gpc_reg_operand (operands[1], DImode))"
@ -7760,21 +7809,43 @@
stfd%U0%X0 %1,%0
lfd%U1%X1 %0,%1
fmr %0,%1
stxsd %1,%0
stxsdx %x1,%y0
lxsd %0,%1
lxsdx %x0,%y1
xxlor %x0,%x1,%x1
xxspltib %x0,0
xxspltib %x0,255
vspltisw %0,%1
xxlxor %x0,%x0,%x0
xxlorc %x0,%x0,%x0
#
#
mf%1 %0
mt%0 %1
nop
mftgpr %0,%1
mffgpr %0,%1
mfvsrd %0,%x1
mtvsrd %x0,%1
xxlxor %x0,%x0,%x0"
[(set_attr "type" "store,load,*,*,*,*,fpstore,fpload,fp,mfjmpr,mtjmpr,*,mftgpr,mffgpr,mftgpr,mffgpr,vecsimple")
(set_attr "length" "4,4,4,4,4,20,4,4,4,4,4,4,4,4,4,4,4")])
mtvsrd %x0,%1"
[(set_attr "type"
"store, load, *, *, *, *,
fpstore, fpload, fp, fpstore, fpstore, fpload,
fpload, vecsimple, vecsimple, vecsimple, vecsimple, vecsimple,
vecsimple, vecsimple, vecsimple, mfjmpr, mtjmpr, *,
mftgpr, mffgpr, mftgpr, mffgpr")
(set_attr "length"
"4, 4, 4, 4, 4, 20,
4, 4, 4, 4, 4, 4,
4, 4, 4, 4, 4, 8,
8, 4, 4, 4, 4, 4,
4, 4, 4, 4")])
; Some DImode loads are best done as a load of -1 followed by a mask
; instruction.
(define_split
[(set (match_operand:DI 0 "gpc_reg_operand")
[(set (match_operand:DI 0 "int_reg_operand_not_pseudo")
(match_operand:DI 1 "const_int_operand"))]
"TARGET_POWERPC64
&& num_insns_constant (operands[1], DImode) > 1
@ -7791,7 +7862,7 @@
;; When non-easy constants can go in the TOC, this should use
;; easy_fp_constant predicate.
(define_split
[(set (match_operand:DI 0 "gpc_reg_operand" "")
[(set (match_operand:DI 0 "int_reg_operand_not_pseudo" "")
(match_operand:DI 1 "const_int_operand" ""))]
"TARGET_POWERPC64 && num_insns_constant (operands[1], DImode) > 1"
[(set (match_dup 0) (match_dup 2))
@ -7805,7 +7876,7 @@
}")
(define_split
[(set (match_operand:DI 0 "gpc_reg_operand" "")
[(set (match_operand:DI 0 "int_reg_operand_not_pseudo" "")
(match_operand:DI 1 "const_scalar_int_operand" ""))]
"TARGET_POWERPC64 && num_insns_constant (operands[1], DImode) > 1"
[(set (match_dup 0) (match_dup 2))
@ -7817,6 +7888,43 @@
else
FAIL;
}")
(define_split
[(set (match_operand:DI 0 "altivec_register_operand" "")
(match_operand:DI 1 "s5bit_cint_operand" ""))]
"TARGET_UPPER_REGS_DI && TARGET_VSX && reload_completed"
[(const_int 0)]
{
rtx op0 = operands[0];
rtx op1 = operands[1];
int r = REGNO (op0);
rtx op0_v4si = gen_rtx_REG (V4SImode, r);
emit_insn (gen_altivec_vspltisw (op0_v4si, op1));
if (op1 != const0_rtx && op1 != constm1_rtx)
{
rtx op0_v2di = gen_rtx_REG (V2DImode, r);
emit_insn (gen_altivec_vupkhsw (op0_v2di, op0_v4si));
}
DONE;
})
(define_split
[(set (match_operand:DI 0 "altivec_register_operand" "")
(match_operand:DI 1 "xxspltib_constant_split" ""))]
"TARGET_UPPER_REGS_DI && TARGET_P9_VECTOR && reload_completed"
[(const_int 0)]
{
rtx op0 = operands[0];
rtx op1 = operands[1];
int r = REGNO (op0);
rtx op0_v16qi = gen_rtx_REG (V16QImode, r);
emit_insn (gen_xxspltib_v16qi (op0_v16qi, op1));
emit_insn (gen_vsx_sign_extend_qi_di (operands[0], op0_v16qi));
DONE;
})
;; TImode/PTImode is similar, except that we usually want to compute the
;; address into a register and use lsi/stsi (the exception is during reload).
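
The two new DImode constant splitters both emit a splat followed by a sign extension. A minimal sketch of that arithmetic on a single element, in plain C rather than RTL; all function names are hypothetical, and the model ignores register layout and endianness. It also suggests why the first splitter skips VUPKHSW for 0 and -1: two adjacent words of all zeros or all ones already read as the intended 64-bit value.

```c
#include <stdint.h>

/* One 32-bit lane after VSPLTISW with a signed 5-bit immediate.  */
static int32_t
splat_word (int imm5)
{
  return (int32_t) imm5;
}

/* One 64-bit lane after VUPKHSW: sign-extend a word to a doubleword.  */
static int64_t
unpack_word (int32_t word)
{
  return (int64_t) word;
}

/* Bits of a doubleword made of two identical words, with no unpack step.  */
static uint64_t
two_words_as_doubleword (int32_t word)
{
  uint64_t w = (uint32_t) word;
  return (w << 32) | w;
}

/* One 64-bit lane after XXSPLTIB plus a byte-to-doubleword sign extension.
   VALUE is assumed to be in the range -128..127.  */
static int64_t
splat_byte_and_extend (int value)
{
  return (int64_t) (int8_t) value;
}
```

For a constant such as 5 the unpack step is required, because two splatted words read together form 0x0000000500000005 and not 5.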

gcc/config/rs6000/rs6000.opt

@ -597,6 +597,10 @@ mupper-regs
Target Report Var(TARGET_UPPER_REGS) Init(-1) Save
Allow float/double variables in upper registers if cpu allows it.
mupper-regs-di
Target Report Mask(UPPER_REGS_DI) Var(rs6000_isa_flags)
Allow 64-bit integer variables in upper registers with -mcpu=power7 or -mvsx.
moptimize-swaps
Target Undocumented Var(rs6000_optimize_swaps) Init(1) Save
Analyze and remove doubleword swaps from VSX computations.
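
How the new -mupper-regs-di mask interacts with the umbrella -mupper-regs option (the rs6000_option_override_internal hunks in rs6000.c) can be sketched as follows. This is a simplified model under stated assumptions: the bit values, struct isa_flags and apply_upper_regs are hypothetical, and the second hunk is assumed to be the -mno-upper-regs branch, which the diff context does not show.

```c
#include <stdbool.h>

/* Hypothetical bit assignments; the real masks are generated from rs6000.opt.  */
enum
{
  MASK_VSX = 1u << 0,
  MASK_UPPER_REGS_DI = 1u << 1
};

struct isa_flags
{
  unsigned flags;		/* Stand-in for rs6000_isa_flags.  */
  unsigned explicit_flags;	/* Stand-in for rs6000_isa_flags_explicit.  */
};

/* UPPER_REGS is > 0 for -mupper-regs, 0 for -mno-upper-regs and -1 when
   neither was given, matching Init(-1) in rs6000.opt.  */
static void
apply_upper_regs (struct isa_flags *isa, int upper_regs)
{
  bool vsx = (isa->flags & MASK_VSX) != 0;
  bool di_explicit = (isa->explicit_flags & MASK_UPPER_REGS_DI) != 0;

  /* Nothing to do without VSX, or if the user set the DI option directly.  */
  if (upper_regs < 0 || !vsx || di_explicit)
    return;

  if (upper_regs > 0)
    isa->flags |= MASK_UPPER_REGS_DI;
  else
    isa->flags &= ~(unsigned) MASK_UPPER_REGS_DI;

  /* Record the choice as explicit so later defaults do not override it.  */
  isa->explicit_flags |= MASK_UPPER_REGS_DI;
}
```

The point of the model is the precedence: an explicit -mupper-regs-di or -mno-upper-regs-di wins over the umbrella option, and the umbrella option only takes effect when VSX is enabled.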

gcc/config/rs6000/vsx.md

@ -260,7 +260,7 @@
(V2DI "wi")])
;; Iterators for loading constants with xxspltib
(define_mode_iterator VSINT_84 [V4SI V2DI])
(define_mode_iterator VSINT_84 [V4SI V2DI DI])
(define_mode_iterator VSINT_842 [V8HI V4SI V2DI])
;; Constants for creating unspecs
@ -2095,77 +2095,69 @@
[(set_attr "type" "vecperm")])
;; Extract a DF/DI element from V2DF/V2DI
(define_expand "vsx_extract_<mode>"
[(set (match_operand:<VS_scalar> 0 "register_operand" "")
(vec_select:<VS_scalar> (match_operand:VSX_D 1 "register_operand" "")
(parallel
[(match_operand:QI 2 "u5bit_cint_operand" "")])))]
"VECTOR_MEM_VSX_P (<MODE>mode)"
"")
;; Optimize cases were we can do a simple or direct move.
;; Or see if we can avoid doing the move at all
(define_insn "*vsx_extract_<mode>_internal1"
[(set (match_operand:<VS_scalar> 0 "register_operand" "=d,<VS_64reg>,r,r")
;; There are some unresolved problems with reload that show up if an Altivec
;; register was picked. Limit the scalar value to FPRs for now.
(define_insn "vsx_extract_<mode>"
[(set (match_operand:<VS_scalar> 0 "gpc_reg_operand"
"=d, wm, wo, d")
(vec_select:<VS_scalar>
(match_operand:VSX_D 1 "register_operand" "d,<VS_64reg>,<VS_64dm>,<VS_64dm>")
(match_operand:VSX_D 1 "gpc_reg_operand"
"<VSa>, <VSa>, <VSa>, <VSa>")
(parallel
[(match_operand:QI 2 "vsx_scalar_64bit" "wD,wD,wD,wL")])))]
"VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
[(match_operand:QI 2 "const_0_to_1_operand"
"wD, wD, wL, n")])))]
"VECTOR_MEM_VSX_P (<MODE>mode)"
{
int element = INTVAL (operands[2]);
int op0_regno = REGNO (operands[0]);
int op1_regno = REGNO (operands[1]);
if (op0_regno == op1_regno)
return "nop";
if (INT_REGNO_P (op0_regno))
return ((INTVAL (operands[2]) == VECTOR_ELEMENT_MFVSRLD_64BIT)
? "mfvsrdl %0,%x1"
: "mfvsrd %0,%x1");
if (FP_REGNO_P (op0_regno) && FP_REGNO_P (op1_regno))
return "fmr %0,%1";
return "xxlor %x0,%x1,%x1";
}
[(set_attr "type" "fp,vecsimple,mftgpr,mftgpr")
(set_attr "length" "4")])
(define_insn "*vsx_extract_<mode>_internal2"
[(set (match_operand:<VS_scalar> 0 "vsx_register_operand" "=d,<VS_64reg>,<VS_64reg>")
(vec_select:<VS_scalar>
(match_operand:VSX_D 1 "vsx_register_operand" "d,wd,wd")
(parallel [(match_operand:QI 2 "u5bit_cint_operand" "wD,wD,i")])))]
"VECTOR_MEM_VSX_P (<MODE>mode)
&& (!TARGET_POWERPC64 || !TARGET_DIRECT_MOVE
|| INTVAL (operands[2]) != VECTOR_ELEMENT_SCALAR_64BIT)"
{
int fldDM;
gcc_assert (UINTVAL (operands[2]) <= 1);
if (INTVAL (operands[2]) == VECTOR_ELEMENT_SCALAR_64BIT)
gcc_assert (IN_RANGE (element, 0, 1));
gcc_assert (VSX_REGNO_P (op1_regno));
if (element == VECTOR_ELEMENT_SCALAR_64BIT)
{
int op0_regno = REGNO (operands[0]);
int op1_regno = REGNO (operands[1]);
if (op0_regno == op1_regno)
return "nop";
return ASM_COMMENT_START " vec_extract to same register";
if (FP_REGNO_P (op0_regno) && FP_REGNO_P (op1_regno))
else if (INT_REGNO_P (op0_regno) && TARGET_DIRECT_MOVE
&& TARGET_POWERPC64)
return "mfvsrd %0,%x1";
else if (FP_REGNO_P (op0_regno) && FP_REGNO_P (op1_regno))
return "fmr %0,%1";
return "xxlor %x0,%x1,%x1";
else if (VSX_REGNO_P (op0_regno))
return "xxlor %x0,%x1,%x1";
else
gcc_unreachable ();
}
fldDM = INTVAL (operands[2]) << 1;
if (!BYTES_BIG_ENDIAN)
fldDM = 3 - fldDM;
operands[3] = GEN_INT (fldDM);
return "xxpermdi %x0,%x1,%x1,%3";
else if (element == VECTOR_ELEMENT_MFVSRLD_64BIT && INT_REGNO_P (op0_regno)
&& TARGET_P9_VECTOR && TARGET_POWERPC64 && TARGET_DIRECT_MOVE)
return "mfvsrdl %0,%x1";
else if (VSX_REGNO_P (op0_regno))
{
fldDM = element << 1;
if (!BYTES_BIG_ENDIAN)
fldDM = 3 - fldDM;
operands[3] = GEN_INT (fldDM);
return "xxpermdi %x0,%x1,%x1,%3";
}
else
gcc_unreachable ();
}
[(set_attr "type" "fp,vecsimple,vecperm")
(set_attr "length" "4")])
[(set_attr "type" "vecsimple,mftgpr,mftgpr,vecperm")])
;; Optimize extracting a single scalar element from memory if the scalar is in
;; the correct location to use a single load.
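
The XXPERMDI immediate chosen by the combined vsx_extract_<mode> insn can be checked by hand. A minimal sketch of the fldDM computation in standalone C; xxpermdi_dm_for_extract is a hypothetical helper, and the big_endian parameter stands in for BYTES_BIG_ENDIAN.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper mirroring the fldDM computation in vsx_extract_<mode>.  */
static int
xxpermdi_dm_for_extract (int element, bool big_endian)
{
  assert (element == 0 || element == 1);	/* Mirrors the gcc_assert.  */

  int dm = element << 1;
  if (!big_endian)
    dm = 3 - dm;

  return dm;
}
```

On big endian, elements 0 and 1 map to DM values 0 and 2; on little endian they map to 3 and 1.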

gcc/doc/invoke.texi

@ -1009,6 +1009,7 @@ See RS/6000 and PowerPC Options.
-mquad-memory-atomic -mno-quad-memory-atomic @gol
-mcompat-align-parm -mno-compat-align-parm @gol
-mupper-regs-df -mno-upper-regs-df -mupper-regs-sf -mno-upper-regs-sf @gol
-mupper-regs-di -mno-upper-regs-di @gol
-mupper-regs -mno-upper-regs -mmodulo -mno-modulo @gol
-mfloat128 -mno-float128 -mfloat128-hardware -mno-float128-hardware @gol
-mpower9-fusion -mno-mpower9-fusion -mpower9-vector -mno-power9-vector @gol
@ -20255,6 +20256,17 @@ Generate code that uses (does not use) the atomic quad word memory
instructions. The @option{-mquad-memory-atomic} option requires use of
64-bit mode.
@item -mupper-regs-di
@itemx -mno-upper-regs-di
@opindex mupper-regs-di
@opindex mno-upper-regs-di
Generate code that uses (does not use) the scalar instructions that
target all 64 registers in the vector/scalar floating point register
set that were added in version 2.06 of the PowerPC ISA when processing
integers. @option{-mupper-regs-di} is turned on by default if you use
any of the @option{-mcpu=power7}, @option{-mcpu=power8},
@option{-mcpu=power9}, or @option{-mvsx} options.
@item -mupper-regs-df
@itemx -mno-upper-regs-df
@opindex mupper-regs-df
@ -20263,8 +20275,8 @@ Generate code that uses (does not use) the scalar double precision
instructions that target all 64 registers in the vector/scalar
floating point register set that were added in version 2.06 of the
PowerPC ISA. @option{-mupper-regs-df} is turned on by default if you
use any of the @option{-mcpu=power7}, @option{-mcpu=power8}, or
@option{-mvsx} options.
use any of the @option{-mcpu=power7}, @option{-mcpu=power8},
@option{-mcpu=power9}, or @option{-mvsx} options.
@item -mupper-regs-sf
@itemx -mno-upper-regs-sf
@ -20274,8 +20286,8 @@ Generate code that uses (does not use) the scalar single precision
instructions that target all 64 registers in the vector/scalar
floating point register set that were added in version 2.07 of the
PowerPC ISA. @option{-mupper-regs-sf} is turned on by default if you
use either of the @option{-mcpu=power8} or @option{-mpower8-vector}
options.
use any of the @option{-mcpu=power8}, @option{-mpower8-vector}, or
@option{-mcpu=power9} options.
@item -mupper-regs
@itemx -mno-upper-regs

gcc/doc/md.texi

@ -3211,6 +3211,9 @@ FP or VSX register to perform ISA 2.07 float ops or NO_REGS.
@item wz
Floating point register if the LFIWZX instruction is enabled or NO_REGS.
@item wB
Signed 5-bit constant integer that can be loaded into an altivec register.
@item wD
Int constant that is the element number of the 64-bit scalar in a vector.

gcc/testsuite/ChangeLog

@ -1,3 +1,8 @@
2016-06-15 Michael Meissner <meissner@linux.vnet.ibm.com>
* gcc.target/powerpc/p9-dimode1.c: New test.
* gcc.target/powerpc/p9-dimode2.c: Likewise.
2016-06-15 Jakub Jelinek <jakub@redhat.com>
* gcc.c-torture/compile/20160615-1.c: New test.

gcc/testsuite/gcc.target/powerpc/p9-dimode1.c

@ -0,0 +1,50 @@
/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
/* { dg-require-effective-target powerpc_p9vector_ok } */
/* { dg-options "-mcpu=power9 -O2 -mupper-regs-di" } */
/* Verify P9 changes to allow DImode into Altivec registers, and generate
constants using XXSPLTIB. */
#ifndef _ARCH_PPC64
#error "This code is 64-bit."
#endif
double
p9_zero (void)
{
long l = 0;
double ret;
__asm__ ("xxlor %x0,%x1,%x1" : "=&d" (ret) : "wi" (l));
return ret;
}
double
p9_plus_1 (void)
{
long l = 1;
double ret;
__asm__ ("xxlor %x0,%x1,%x1" : "=&d" (ret) : "wi" (l));
return ret;
}
double
p9_minus_1 (void)
{
long l = -1;
double ret;
__asm__ ("xxlor %x0,%x1,%x1" : "=&d" (ret) : "wi" (l));
return ret;
}
/* { dg-final { scan-assembler "xxspltib" } } */
/* { dg-final { scan-assembler-not "mtvsrd" } } */
/* { dg-final { scan-assembler-not "lfd" } } */
/* { dg-final { scan-assembler-not "ld" } } */
/* { dg-final { scan-assembler-not "lxsd" } } */

gcc/testsuite/gcc.target/powerpc/p9-dimode2.c

@ -0,0 +1,27 @@
/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
/* { dg-require-effective-target powerpc_p9vector_ok } */
/* { dg-options "-mcpu=power9 -O2 -mupper-regs-di" } */
/* Verify that large integer constants are loaded via direct move instead of being
loaded from memory. */
#ifndef _ARCH_PPC64
#error "This code is 64-bit."
#endif
double
p9_large (void)
{
long l = 0x12345678;
double ret;
__asm__ ("xxlor %x0,%x1,%x1" : "=&d" (ret) : "wi" (l));
return ret;
}
/* { dg-final { scan-assembler "mtvsrd" } } */
/* { dg-final { scan-assembler-not "ld" } } */
/* { dg-final { scan-assembler-not "lfd" } } */
/* { dg-final { scan-assembler-not "lxsd" } } */