vsx.md (VSINT_84): Add DImode to enable loading DImode constants with XXSPLTIB in vector registers.

[gcc]
2016-06-15  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* config/rs6000/vsx.md (VSINT_84): Add DImode to enable loading
	DImode constants with XXSPLTIB in vector registers.
	(vsx_extract_<mode>, V2DImode/V2DFmode): Combine both
	vsx_extract_<mode>_internal{1,2} into a single insn that handles
	direct move (both ISA 2.07 and ISA 3.0 versions), and optimizes
	extraction of the element at the top of the register as a scalar
	value.
	(vsx_extract_<mode>_internal1): Likewise.
	(vsx_extract_<mode>_internal2): Likewise.
	* config/rs6000/constraints.md (wi constraint): Remove a comment
	about DImode not being allowed in Altivec registers.
	(wB constraint): New constraint for constants that can be
	generated in Altivec registers with VSPLTISW/VUPKHSW.
	* config/rs6000/predicates.md (xxspltib_constant_split): Update
	comments.
	(xxspltib_constant_nosplit): Likewise.
	* config/rs6000/rs6000-cpus.def (ISA_2_6_MASKS_SERVER): Add
	support for -mupper-regs-di to enable DImode to go into Altivec
	registers.
	(POWERPC_MASKS): Likewise.
	(power7 cpu): Likewise.
	* config/rs6000/rs6000.opt (-mupper-regs-di): Likewise.
	* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Add support
	for DImode being allowed in Altivec registers.  Update wi/wj
	constraints.  Set scalar_in_vmx_p flag.
	(rs6000_option_override_internal): Add checks for -mupper-regs-di.
	(xxspltib_constant_p): Allow CONST_INT's with VOIDmode.  Don't
	return true if we could use VSPLTISW/VUPKHSW instead of XXSPLTIB.
	(rs6000_opt_masks): Add -mupper-regs-di.
	* config/rs6000/rs6000.md (lfiwax): Update clobbers that don't use
	direct move to use wi and not wj.
	(lfiwzx): Likewise.
	(floatsi<mode>2_lfiwax_mem): Combine alternatives into a single
	alternative.
	(floatunssi<mode>2_lfiwzx_mem): Likewise.
	(fix_trunc<mode>di2_fctidz): Change second alternative to allow
	any VSX register, instead of just Altivec registers, to allow
	either operand to be an Altivec register or both.
	(fixuns_trunc<mode>di2_fctiduz): Likewise.
	(movdi_internal32): Add support for -mupper-regs-di.  Add support
	to load constants via XXSPLTIB or VSPLTISW.  Add spacing to allow
	the alternatives and attributes to be lined up to be easier to
	read.
	(movdi_internal64): Likewise.
	(64-bit DImode splitters): Change predicates to only split loading
	up GPR registers.  Add splits for using XXSPLTIB or VSPLTISW to
	load constants in ISA 3.0 or ISA 2.07 respectively.
	* doc/invoke.texi (RS/6000 and PowerPC Options): Document
	-mupper-regs-di.  Update -mupper-regs-df and -mupper-regs-sf to
	mention -mcpu=power9 sets these options.
	* doc/md.texi (PowerPC and IBM RS6000 constraints): Document the
	wB constraint.

[gcc/testsuite]
2016-06-15  Michael Meissner  <meissner@linux.vnet.ibm.com>

	* gcc.target/powerpc/p9-dimode1.c: New test.
	* gcc.target/powerpc/p9-dimode2.c: Likewise.

From-SVN: r237490
2016-06-15 18:17:58 +00:00 · 1a3c3ee9bc
parent 61daecc46b
commit 1a3c3ee9bc
13 changed files with 397 additions and 100 deletions
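
The constant-loading preference described in the commit message (0 and -1 first, then VSPLTISW/VUPKHSW for signed 5-bit values, then XXSPLTIB plus a sign extension for signed 8-bit values) can be sketched in standalone C. This is only an illustration, not GCC source: the enum, the function name `classify_di_constant`, and the two feature flags are hypothetical, and the sketch assumes VSX is enabled and that the destination is an Altivec register. The register allocator, not this rule alone, decides which alternative is actually used.

```c
/* Illustrative sketch only; not part of the patch.  All names here are
   hypothetical and exist just for this example.  */

enum di_const_load
{
  DI_LOAD_SINGLE_INSN,       /* 0 or -1: one splat or logical instruction.  */
  DI_LOAD_VSPLTISW_VUPKHSW,  /* Signed 5-bit value, needs ISA 2.07.  */
  DI_LOAD_XXSPLTIB_EXTEND,   /* Signed 8-bit value, needs ISA 3.0.  */
  DI_LOAD_OTHER              /* GPR sequence, direct move, or memory.  */
};

/* Classify how a 64-bit integer constant might be materialized in an
   Altivec register, following the order of preference the commit
   message describes.  */
static enum di_const_load
classify_di_constant (long value, int have_p8_vector, int have_p9_vector)
{
  /* 0 and -1 have dedicated single-instruction alternatives.  */
  if (value == 0 || value == -1)
    return DI_LOAD_SINGLE_INSN;

  /* Signed 5-bit constants: VSPLTISW followed by VUPKHSW is preferred
     over XXSPLTIB when it is available.  */
  if (have_p8_vector && value >= -16 && value <= 15)
    return DI_LOAD_VSPLTISW_VUPKHSW;

  /* Remaining signed 8-bit constants: XXSPLTIB plus a sign extension.  */
  if (have_p9_vector && value >= -128 && value <= 127)
    return DI_LOAD_XXSPLTIB_EXTEND;

  return DI_LOAD_OTHER;
}
```

Under these assumptions, `classify_di_constant (1, 1, 1)` selects the VSPLTISW/VUPKHSW pair, while a value such as 0x12345678 falls outside both ranges and is left to the other move alternatives.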
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,58 @@
+2016-06-15  Michael Meissner  <meissner@linux.vnet.ibm.com>
+
+	* config/rs6000/vsx.md (VSINT_84): Add DImode to enable loading
+	DImode constants with XXSPLTIB in vector registers.
+	(vsx_extract_<mode>, V2DImode/V2DFmode): Combine both
+	vsx_extract_<mode>_internal{1,2} into a single insn that handles
+	direct move (both ISA 2.07 and ISA 3.0 versions), and optimizes
+	extraction of the element at the top of the register as a scalar
+	value.
+	(vsx_extract_<mode>_internal1): Likewise.
+	(vsx_extract_<mode>_internal2): Likewise.
+	* config/rs6000/constraints.md (wi constraint): Remove a comment
+	about DImode not being allowed in Altivec registers.
+	(wB constraint): New constraint for constants that can be
+	generated in Altivec registers with VSPLTISW/VUPKHSW.
+	* config/rs6000/predicates.md (xxspltib_constant_split): Update
+	comments.
+	(xxspltib_constant_nosplit): Likewise.
+	* config/rs6000/rs6000-cpus.def (ISA_2_6_MASKS_SERVER): Add
+	support for -mupper-regs-di to enable DImode to go into Altivec
+	registers.
+	(POWERPC_MASKS): Likewise.
+	(power7 cpu): Likewise.
+	* config/rs6000/rs6000.opt (-mupper-regs-di): Likewise.
+	* config/rs6000/rs6000.c (rs6000_hard_regno_mode_ok): Add support
+	for DImode being allowed in Altivec registers.  Update wi/wj
+	constraints.  Set scalar_in_vmx_p flag.
+	(rs6000_option_override_internal): Add checks for -mupper-regs-di.
+	(xxspltib_constant_p): Allow CONST_INT's with VOIDmode.  Don't
+	return true if we could use VSPLTISW/VUPKHSW instead of XXSPLTIB.
+	(rs6000_opt_masks): Add -mupper-regs-di.
+	* config/rs6000/rs6000.md (lfiwax): Update clobbers that don't use
+	direct move to use wi and not wj.
+	(lfiwzx): Likewise.
+	(floatsi<mode>2_lfiwax_mem): Combine alternatives into a single
+	alternative.
+	(floatunssi<mode>2_lfiwzx_mem): Likewise.
+	(fix_trunc<mode>di2_fctidz): Change second alternative to allow
+	any VSX register, instead of just Altivec registers, to allow
+	either operand to be an Altivec register or both.
+	(fixuns_trunc<mode>di2_fctiduz): Likewise.
+	(movdi_internal32): Add support for -mupper-regs-di.  Add support
+	to load constants via XXSPLTIB or VSPLTISW.  Add spacing to allow
+	the alternatives and attributes to be lined up to be easier to
+	read.
+	(movdi_internal64): Likewise.
+	(64-bit DImode splitters): Change predicates to only split loading
+	up GPR registers.  Add splits for using XXSPLTIB or VSPLTISW to
+	load constants in ISA 3.0 or ISA 2.07 respectively.
+	* doc/invoke.texi (RS/6000 and PowerPC Options): Document
+	-mupper-regs-di.  Update -mupper-regs-df and -mupper-regs-sf to
+	mention -mcpu=power9 sets these options.
+	* doc/md.texi (PowerPC and IBM RS6000 constraints): Document the
+	wB constraint.
+
 2016-06-15  Pitchumani Sivanupandi  <pitchumani.s@atmel.com>

 	PR target/67353
--- a/gcc/config/rs6000/constraints.md
+++ b/gcc/config/rs6000/constraints.md
@@ -77,8 +77,6 @@
 (define_register_constraint "wh" "rs6000_constraints[RS6000_CONSTRAINT_wh]"
  "Floating point register if direct moves are available, or NO_REGS.")

-;; At present, DImode is not allowed in the Altivec registers.  If in the
-;; future it is allowed, wi/wj can be set to VSX_REGS instead of FLOAT_REGS.
 (define_register_constraint "wi" "rs6000_constraints[RS6000_CONSTRAINT_wi]"
  "FP or VSX register to hold 64-bit integers for VSX insns or NO_REGS.")

@@ -135,6 +133,13 @@
 (define_register_constraint "wz" "rs6000_constraints[RS6000_CONSTRAINT_wz]"
  "Floating point register if the LFIWZX instruction is enabled or NO_REGS.")

+;; wB needs ISA 2.07 VUPKHSW
+(define_constraint "wB"
+  "Signed 5-bit constant integer that can be loaded into an altivec register."
+  (and (match_code "const_int")
+       (and (match_test "TARGET_P8_VECTOR")
+	    (match_operand 0 "s5bit_cint_operand"))))
+
 (define_constraint "wD"
  "Int constant that is the element number of the 64-bit scalar in a vector."
  (and (match_code "const_int")
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -565,9 +565,8 @@
    }
 })

-;; Return 1 if the operand is a CONST_VECTOR or VEC_DUPLICATE of a constant
-;; that can loaded with a XXSPLTIB instruction and then a VUPKHSB, VECSB2W or
-;; VECSB2D instruction.
+;; Return 1 if the operand is a constant that can loaded with a XXSPLTIB
+;; instruction and then a VUPKHSB, VECSB2W or VECSB2D instruction.

 (define_predicate "xxspltib_constant_split"
  (match_code "const_vector,vec_duplicate,const_int")
@@ -582,8 +581,8 @@
 })


-;; Return 1 if the operand is a CONST_VECTOR that can loaded directly with a
-;; XXSPLTIB instruction.
+;; Return 1 if the operand is constant that can loaded directly with a XXSPLTIB
+;; instruction.

 (define_predicate "xxspltib_constant_nosplit"
  (match_code "const_vector,vec_duplicate,const_int")
--- a/gcc/config/rs6000/rs6000-cpus.def
+++ b/gcc/config/rs6000/rs6000-cpus.def
@@ -45,6 +45,7 @@
 				 | OPTION_MASK_POPCNTD			\
 				 | OPTION_MASK_ALTIVEC			\
 				 | OPTION_MASK_VSX			\
+				 | OPTION_MASK_UPPER_REGS_DI		\
 				 | OPTION_MASK_UPPER_REGS_DF)

 /* For now, don't provide an embedded version of ISA 2.07.  */
@@ -119,6 +120,7 @@
 				 | OPTION_MASK_SOFT_FLOAT		\
 				 | OPTION_MASK_STRICT_ALIGN_OPTIONAL	\
 				 | OPTION_MASK_TOC_FUSION		\
+				 | OPTION_MASK_UPPER_REGS_DI		\
 				 | OPTION_MASK_UPPER_REGS_DF		\
 				 | OPTION_MASK_UPPER_REGS_SF		\
 				 | OPTION_MASK_VSX			\
@@ -211,7 +213,8 @@ RS6000_CPU ("power6x", PROCESSOR_POWER6, MASK_POWERPC64 | MASK_PPC_GPOPT
 RS6000_CPU ("power7", PROCESSOR_POWER7,   /* Don't add MASK_ISEL by default */
 	    POWERPC_7400_MASK | MASK_POWERPC64 | MASK_PPC_GPOPT | MASK_MFCRF
 	    | MASK_POPCNTB | MASK_FPRND | MASK_CMPB | MASK_DFP | MASK_POPCNTD
-	    | MASK_VSX | MASK_RECIP_PRECISION | OPTION_MASK_UPPER_REGS_DF)
+	    | MASK_VSX | MASK_RECIP_PRECISION | OPTION_MASK_UPPER_REGS_DF
+	    | OPTION_MASK_UPPER_REGS_DI)
 RS6000_CPU ("power8", PROCESSOR_POWER8, MASK_POWERPC64 | ISA_2_7_MASKS_SERVER)
 RS6000_CPU ("power9", PROCESSOR_POWER9, MASK_POWERPC64 | ISA_3_0_MASKS_SERVER)
 RS6000_CPU ("powerpc", PROCESSOR_POWERPC, 0)
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1938,7 +1938,8 @@ rs6000_hard_regno_mode_ok (int regno, machine_mode mode)
 	  || FLOAT128_VECTOR_P (mode)
 	  || reg_addr[mode].scalar_in_vmx_p
 	  || (TARGET_VSX_TIMODE && mode == TImode)
-	  || (TARGET_VADDUQM && mode == V1TImode)))
+	  || (TARGET_VADDUQM && mode == V1TImode)
+	  || (TARGET_UPPER_REGS_DI && mode == DImode)))
    {
      if (FP_REGNO_P (regno))
 	return FP_REGNO_P (last_regno);
@@ -3082,7 +3083,6 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
      rs6000_constraints[RS6000_CONSTRAINT_wa] = VSX_REGS;
      rs6000_constraints[RS6000_CONSTRAINT_wd] = VSX_REGS;	/* V2DFmode  */
      rs6000_constraints[RS6000_CONSTRAINT_wf] = VSX_REGS;	/* V4SFmode  */
-      rs6000_constraints[RS6000_CONSTRAINT_wi] = FLOAT_REGS;	/* DImode  */

      if (TARGET_VSX_TIMODE)
 	rs6000_constraints[RS6000_CONSTRAINT_wt] = VSX_REGS;	/* TImode  */
@@ -3094,6 +3094,11 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
 	}
      else
 	rs6000_constraints[RS6000_CONSTRAINT_ws] = FLOAT_REGS;
+
+      if (TARGET_UPPER_REGS_DF)					/* DImode  */
+	rs6000_constraints[RS6000_CONSTRAINT_wi] = VSX_REGS;
+      else
+	rs6000_constraints[RS6000_CONSTRAINT_wi] = FLOAT_REGS;
    }

  /* Add conditional constraints based on various options, to allow us to
@@ -3306,6 +3311,9 @@ rs6000_init_hard_regno_mode_ok (bool global_init_p)
      if (TARGET_UPPER_REGS_DF)
 	reg_addr[DFmode].scalar_in_vmx_p = true;

+      if (TARGET_UPPER_REGS_DI)
+	reg_addr[DImode].scalar_in_vmx_p = true;
+
      if (TARGET_UPPER_REGS_SF)
 	reg_addr[SFmode].scalar_in_vmx_p = true;
    }
@@ -4085,9 +4093,9 @@ rs6000_option_override_internal (bool global_init_p)
      rs6000_isa_flags &= ~OPTION_MASK_DFP;
    }

-  /* Allow an explicit -mupper-regs to set both -mupper-regs-df and
-     -mupper-regs-sf, depending on the cpu, unless the user explicitly also set
-     the individual option.  */
+  /* Allow an explicit -mupper-regs to set -mupper-regs-df, -mupper-regs-di,
+     and -mupper-regs-sf, depending on the cpu, unless the user explicitly also
+     set the individual option.  */
  if (TARGET_UPPER_REGS > 0)
    {
      if (TARGET_VSX
@@ -4096,6 +4104,12 @@ rs6000_option_override_internal (bool global_init_p)
 	  rs6000_isa_flags |= OPTION_MASK_UPPER_REGS_DF;
 	  rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DF;
 	}
+      if (TARGET_VSX
+	  && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DI))
+	{
+	  rs6000_isa_flags |= OPTION_MASK_UPPER_REGS_DI;
+	  rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DI;
+	}
      if (TARGET_P8_VECTOR
 	  && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF))
 	{
@@ -4111,6 +4125,12 @@ rs6000_option_override_internal (bool global_init_p)
 	  rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DF;
 	  rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DF;
 	}
+      if (TARGET_VSX
+	  && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DI))
+	{
+	  rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DI;
+	  rs6000_isa_flags_explicit |= OPTION_MASK_UPPER_REGS_DI;
+	}
      if (TARGET_P8_VECTOR
 	  && !(rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF))
 	{
@@ -4126,6 +4146,13 @@ rs6000_option_override_internal (bool global_init_p)
      rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DF;
    }

+  if (TARGET_UPPER_REGS_DI && !TARGET_VSX)
+    {
+      if (rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_DF)
+	error ("-mupper-regs-di requires -mvsx");
+      rs6000_isa_flags &= ~OPTION_MASK_UPPER_REGS_DF;
+    }
+
  if (TARGET_UPPER_REGS_SF && !TARGET_P8_VECTOR)
    {
      if (rs6000_isa_flags_explicit & OPTION_MASK_UPPER_REGS_SF)
@@ -4386,6 +4413,7 @@ rs6000_option_override_internal (bool global_init_p)
  if (TARGET_FLOAT128_HW
      && (rs6000_isa_flags & (OPTION_MASK_P9_VECTOR
 			      | OPTION_MASK_DIRECT_MOVE
+			      | OPTION_MASK_UPPER_REGS_DI
 			      | OPTION_MASK_UPPER_REGS_DF
 			      | OPTION_MASK_UPPER_REGS_SF)) == 0)
    {
@@ -6284,7 +6312,7 @@ xxspltib_constant_p (rtx op,
  if (mode == VOIDmode)
    mode = GET_MODE (op);

-  else if (mode != GET_MODE (op))
+  else if (mode != GET_MODE (op) && GET_MODE (op) != VOIDmode)
    return false;

  /* Handle (vec_duplicate <constant>).  */
@@ -6337,8 +6365,8 @@ xxspltib_constant_p (rtx op,
    }

  /* Handle integer constants being loaded into the upper part of the VSX
-     register as a scalar.  If the value isn't 0/-1, only allow it if
-     the mode can go in Altivec registers.  */
+     register as a scalar.  If the value isn't 0/-1, only allow it if the mode
+     can go in Altivec registers.  Prefer VSPLTISW/VUPKHSW over XXSPLITIB.  */
  else if (CONST_INT_P (op))
    {
      if (!SCALAR_INT_MODE_P (mode))
@@ -6348,9 +6376,14 @@ xxspltib_constant_p (rtx op,
      if (!IN_RANGE (value, -128, 127))
 	return false;

-      if (!IN_RANGE (value, -1, 0)
-	  && (reg_addr[mode].addr_mask[RELOAD_REG_VMX] & RELOAD_REG_VALID) == 0)
-	return false;
+      if (!IN_RANGE (value, -1, 0))
+	{
+	  if (!(reg_addr[mode].addr_mask[RELOAD_REG_VMX] & RELOAD_REG_VALID))
+	    return false;
+
+	  if (EASY_VECTOR_15 (value))
+	    return false;
+	}
    }

  else
@@ -35485,6 +35518,7 @@ static struct rs6000_opt_mask const rs6000_opt_masks[] =
  { "string",			OPTION_MASK_STRING,		false, true  },
  { "toc-fusion",		OPTION_MASK_TOC_FUSION,		false, true  },
  { "update",			OPTION_MASK_NO_UPDATE,		true , true  },
+  { "upper-regs-di",		OPTION_MASK_UPPER_REGS_DI,	false, true  },
  { "upper-regs-df",		OPTION_MASK_UPPER_REGS_DF,	false, true  },
  { "upper-regs-sf",		OPTION_MASK_UPPER_REGS_SF,	false, true  },
  { "vsx",			OPTION_MASK_VSX,		false, true  },
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4866,7 +4866,7 @@
 (define_insn_and_split "floatsi<mode>2_lfiwax"
  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Fv>")
 	(float:SFDF (match_operand:SI 1 "nonimmediate_operand" "r")))
-   (clobber (match_scratch:DI 2 "=wj"))]
+   (clobber (match_scratch:DI 2 "=wi"))]
  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWAX
   && <SI_CONVERT_FP> && can_create_pseudo_p ()"
  "#"
@@ -4905,11 +4905,11 @@
   (set_attr "type" "fpload")])

 (define_insn_and_split "floatsi<mode>2_lfiwax_mem"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fa>")
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Fv>")
 	(float:SFDF
 	 (sign_extend:DI
-	  (match_operand:SI 1 "indexed_or_indirect_operand" "Z,Z"))))
-   (clobber (match_scratch:DI 2 "=0,d"))]
+	  (match_operand:SI 1 "indexed_or_indirect_operand" "Z"))))
+   (clobber (match_scratch:DI 2 "=wi"))]
  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWAX
   && <SI_CONVERT_FP>"
  "#"
@@ -4941,7 +4941,7 @@
 (define_insn_and_split "floatunssi<mode>2_lfiwzx"
  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Fv>")
 	(unsigned_float:SFDF (match_operand:SI 1 "nonimmediate_operand" "r")))
-   (clobber (match_scratch:DI 2 "=wj"))]
+   (clobber (match_scratch:DI 2 "=wi"))]
  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWZX
   && <SI_CONVERT_FP>"
  "#"
@@ -4980,11 +4980,11 @@
   (set_attr "type" "fpload")])

 (define_insn_and_split "floatunssi<mode>2_lfiwzx_mem"
-  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Ff>,<Fa>")
+  [(set (match_operand:SFDF 0 "gpc_reg_operand" "=<Fv>")
 	(unsigned_float:SFDF
 	 (zero_extend:DI
-	  (match_operand:SI 1 "indexed_or_indirect_operand" "Z,Z"))))
-   (clobber (match_scratch:DI 2 "=0,d"))]
+	  (match_operand:SI 1 "indexed_or_indirect_operand" "Z"))))
+   (clobber (match_scratch:DI 2 "=wi"))]
  "TARGET_HARD_FLOAT && TARGET_FPRS && TARGET_DOUBLE_FLOAT && TARGET_LFIWZX
   && <SI_CONVERT_FP>"
  "#"
@@ -5288,7 +5288,7 @@

 (define_insn "*fix_trunc<mode>di2_fctidz"
  [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wi")
-	(fix:DI (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fa>")))]
+	(fix:DI (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")))]
  "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS
    && TARGET_FCFID"
  "@
@@ -5360,7 +5360,7 @@

 (define_insn "*fixuns_trunc<mode>di2_fctiduz"
  [(set (match_operand:DI 0 "gpc_reg_operand" "=d,wi")
-	(unsigned_fix:DI (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fa>")))]
+	(unsigned_fix:DI (match_operand:SFDF 1 "gpc_reg_operand" "<Ff>,<Fv>")))]
  "TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT && TARGET_FPRS
    && TARGET_FCTIDUZ"
  "@
@@ -7700,9 +7700,25 @@
 ;; non-offsettable address by using r->r which won't make progress.
 ;; Use of fprs is disparaged slightly otherwise reload prefers to reload
 ;; a gpr into a fpr instead of reloading an invalid 'Y' address
+
+;;        GPR store  GPR load   GPR move   FPR store  FPR load    FPR move
+;;        GPR const  AVX store  AVX store  AVX load   AVX load    VSX move
+;;        P9 0       P9 -1      AVX 0/-1   VSX 0      VSX -1      P9 const
+;;        AVX const  
+
 (define_insn "*movdi_internal32"
-  [(set (match_operand:DI 0 "rs6000_nonimmediate_operand" "=Y,r,r,?m,?*d,?*d,r")
-	(match_operand:DI 1 "input_operand" "r,Y,r,d,m,d,IJKnGHF"))]
+  [(set (match_operand:DI 0 "rs6000_nonimmediate_operand"
+         "=Y,        r,         r,         ?m,        ?*d,        ?*d,
+          r,         ?Y,        ?Z,        ?*wb,      ?*wv,       ?wi,
+          ?wo,       ?wo,       ?wv,       ?wi,       ?wi,        ?wv,
+          ?wv")
+
+	(match_operand:DI 1 "input_operand"
+          "r,        Y,         r,         d,         m,          d,
+           IJKnGHF,  wb,        wv,        Y,         Z,          wi,
+           Oj,       wM,        OjwM,      Oj,        wM,         wS,
+           wB"))]
+
  "! TARGET_POWERPC64
   && (gpc_reg_operand (operands[0], DImode)
       || gpc_reg_operand (operands[1], DImode))"
@@ -7713,8 +7729,24 @@
   stfd%U0%X0 %1,%0
   lfd%U1%X1 %0,%1
   fmr %0,%1
+   #
+   stxsd %1,%0
+   stxsdx %x1,%y0
+   lxsd %0,%1
+   lxsdx %x0,%y1
+   xxlor %x0,%x1,%x1
+   xxspltib %x0,0
+   xxspltib %x0,255
+   vspltisw %0,%1
+   xxlxor %x0,%x0,%x0
+   xxlorc %x0,%x0,%x0
+   #
   #"
-  [(set_attr "type" "store,load,*,fpstore,fpload,fp,*")])
+  [(set_attr "type"
+               "store,     load,      *,         fpstore,   fpload,     fp,
+                *,         fpstore,   fpstore,   fpload,    fpload,     vecsimple,
+                vecsimple, vecsimple, vecsimple, vecsimple, vecsimple,  vecsimple,
+                vecsimple")])

 (define_split
  [(set (match_operand:DI 0 "gpc_reg_operand" "")
@@ -7744,9 +7776,26 @@
  [(pc)]
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; })

+;;              GPR store  GPR load   GPR move   GPR li     GPR lis     GPR #
+;;              FPR store  FPR load   FPR move   AVX store  AVX store   AVX load
+;;              AVX load   VSX move   P9 0       P9 -1      AVX 0/-1    VSX 0
+;;              VSX -1     P9 const   AVX const  From SPR   To SPR      SPR<->SPR
+;;              FPR->GPR   GPR->FPR   VSX->GPR   GPR->VSX
 (define_insn "*movdi_internal64"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "=Y,r,r,r,r,r,?m,?*d,?*d,r,*h,*h,r,?*wg,r,?*wj,?*wi")
-	(match_operand:DI 1 "input_operand" "r,Y,r,I,L,nF,d,m,d,*h,r,0,*wg,r,*wj,r,O"))]
+  [(set (match_operand:DI 0 "nonimmediate_operand"
+               "=Y,        r,         r,         r,         r,          r,
+                ?m,        ?*d,       ?*d,       ?Y,        ?Z,         ?*wb,
+                ?*wv,      ?wi,       ?wo,       ?wo,       ?wv,        ?wi,
+                ?wi,       ?wv,       ?wv,       r,         *h,         *h,
+                ?*r,       ?*wg,      ?*r,       ?*wj")
+
+	(match_operand:DI 1 "input_operand"
+                "r,        Y,         r,         I,         L,          nF,
+                 d,        m,         d,         wb,        wv,         Y,
+                 Z,        wi,        Oj,        wM,        OjwM,       Oj,
+                 wM,       wS,        wB,        *h,        r,          0,
+                 wg,       r,         wj,        r"))]
+
  "TARGET_POWERPC64
   && (gpc_reg_operand (operands[0], DImode)
       || gpc_reg_operand (operands[1], DImode))"
@@ -7760,21 +7809,43 @@
   stfd%U0%X0 %1,%0
   lfd%U1%X1 %0,%1
   fmr %0,%1
+   stxsd %1,%0
+   stxsdx %x1,%y0
+   lxsd %0,%1
+   lxsdx %x0,%y1
+   xxlor %x0,%x1,%x1
+   xxspltib %x0,0
+   xxspltib %x0,255
+   vspltisw %0,%1
+   xxlxor %x0,%x0,%x0
+   xxlorc %x0,%x0,%x0
+   #
+   #
   mf%1 %0
   mt%0 %1
   nop
   mftgpr %0,%1
   mffgpr %0,%1
   mfvsrd %0,%x1
-   mtvsrd %x0,%1
-   xxlxor %x0,%x0,%x0"
-  [(set_attr "type" "store,load,*,*,*,*,fpstore,fpload,fp,mfjmpr,mtjmpr,*,mftgpr,mffgpr,mftgpr,mffgpr,vecsimple")
-   (set_attr "length" "4,4,4,4,4,20,4,4,4,4,4,4,4,4,4,4,4")])
+   mtvsrd %x0,%1"
+  [(set_attr "type"
+               "store,     load,      *,         *,         *,          *,
+                fpstore,   fpload,    fp,        fpstore,   fpstore,    fpload,
+                fpload,    vecsimple, vecsimple, vecsimple, vecsimple,  vecsimple,
+                vecsimple, vecsimple, vecsimple, mfjmpr,    mtjmpr,     *,
+                mftgpr,    mffgpr,    mftgpr,    mffgpr")
+
+   (set_attr "length"
+               "4,         4,         4,         4,         4,          20,
+                4,         4,         4,         4,         4,          4,
+                4,         4,         4,         4,         4,          8,
+                8,         4,         4,         4,         4,          4,
+                4,         4,         4,         4")])

 ; Some DImode loads are best done as a load of -1 followed by a mask
 ; instruction.
 (define_split
-  [(set (match_operand:DI 0 "gpc_reg_operand")
+  [(set (match_operand:DI 0 "int_reg_operand_not_pseudo")
 	(match_operand:DI 1 "const_int_operand"))]
  "TARGET_POWERPC64
   && num_insns_constant (operands[1], DImode) > 1
@@ -7791,7 +7862,7 @@
 ;; When non-easy constants can go in the TOC, this should use
 ;; easy_fp_constant predicate.
 (define_split
-  [(set (match_operand:DI 0 "gpc_reg_operand" "")
+  [(set (match_operand:DI 0 "int_reg_operand_not_pseudo" "")
 	(match_operand:DI 1 "const_int_operand" ""))]
  "TARGET_POWERPC64 && num_insns_constant (operands[1], DImode) > 1"
  [(set (match_dup 0) (match_dup 2))
@@ -7805,7 +7876,7 @@
 }")

 (define_split
-  [(set (match_operand:DI 0 "gpc_reg_operand" "")
+  [(set (match_operand:DI 0 "int_reg_operand_not_pseudo" "")
 	(match_operand:DI 1 "const_scalar_int_operand" ""))]
  "TARGET_POWERPC64 && num_insns_constant (operands[1], DImode) > 1"
  [(set (match_dup 0) (match_dup 2))
@@ -7817,6 +7888,43 @@
  else
    FAIL;
 }")
+
+(define_split
+  [(set (match_operand:DI 0 "altivec_register_operand" "")
+	(match_operand:DI 1 "s5bit_cint_operand" ""))]
+  "TARGET_UPPER_REGS_DI && TARGET_VSX && reload_completed"
+  [(const_int 0)]
+{
+  rtx op0 = operands[0];
+  rtx op1 = operands[1];
+  int r = REGNO (op0);
+  rtx op0_v4si = gen_rtx_REG (V4SImode, r);
+
+  emit_insn (gen_altivec_vspltisw (op0_v4si, op1));
+  if (op1 != const0_rtx && op1 != constm1_rtx)
+    {
+      rtx op0_v2di = gen_rtx_REG (V2DImode, r);
+      emit_insn (gen_altivec_vupkhsw (op0_v2di, op0_v4si));
+    }
+  DONE;
+})
+
+(define_split
+  [(set (match_operand:DI 0 "altivec_register_operand" "")
+	(match_operand:DI 1 "xxspltib_constant_split" ""))]
+  "TARGET_UPPER_REGS_DI && TARGET_P9_VECTOR && reload_completed"
+  [(const_int 0)]
+{
+  rtx op0 = operands[0];
+  rtx op1 = operands[1];
+  int r = REGNO (op0);
+  rtx op0_v16qi = gen_rtx_REG (V16QImode, r);
+
+  emit_insn (gen_xxspltib_v16qi (op0_v16qi, op1));
+  emit_insn (gen_vsx_sign_extend_qi_di (operands[0], op0_v16qi));
+  DONE;
+})
+

 ;; TImode/PTImode is similar, except that we usually want to compute the
 ;; address into a register and use lsi/stsi (the exception is during reload).
--- a/gcc/config/rs6000/rs6000.opt
+++ b/gcc/config/rs6000/rs6000.opt
@@ -597,6 +597,10 @@ mupper-regs
 Target Report Var(TARGET_UPPER_REGS) Init(-1) Save
 Allow float/double variables in upper registers if cpu allows it.

+mupper-regs-di
+Target Report Mask(UPPER_REGS_DI) Var(rs6000_isa_flags)
+Allow 64-bit integer variables in upper registers with -mcpu=power7 or -mvsx.
+
 moptimize-swaps
 Target Undocumented Var(rs6000_optimize_swaps) Init(1) Save
 Analyze and remove doubleword swaps from VSX computations.
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -260,7 +260,7 @@
 			    (V2DI	"wi")])

 ;; Iterators for loading constants with xxspltib
-(define_mode_iterator VSINT_84  [V4SI V2DI])
+(define_mode_iterator VSINT_84  [V4SI V2DI DI])
 (define_mode_iterator VSINT_842 [V8HI V4SI V2DI])

 ;; Constants for creating unspecs
@@ -2095,77 +2095,69 @@
  [(set_attr "type" "vecperm")])

 ;; Extract a DF/DI element from V2DF/V2DI
-(define_expand "vsx_extract_<mode>"
-  [(set (match_operand:<VS_scalar> 0 "register_operand" "")
-	(vec_select:<VS_scalar> (match_operand:VSX_D 1 "register_operand" "")
-		       (parallel
-			[(match_operand:QI 2 "u5bit_cint_operand" "")])))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)"
-  "")
-
 ;; Optimize cases were we can do a simple or direct move.
 ;; Or see if we can avoid doing the move at all
-(define_insn "*vsx_extract_<mode>_internal1"
-  [(set (match_operand:<VS_scalar> 0 "register_operand" "=d,<VS_64reg>,r,r")
+
+;; There are some unresolved problems with reload that show up if an Altivec
+;; register was picked.  Limit the scalar value to FPRs for now.
+
+(define_insn "vsx_extract_<mode>"
+  [(set (match_operand:<VS_scalar> 0 "gpc_reg_operand"
+            "=d,     wm,      wo,    d")
+
 	(vec_select:<VS_scalar>
-	 (match_operand:VSX_D 1 "register_operand" "d,<VS_64reg>,<VS_64dm>,<VS_64dm>")
+	 (match_operand:VSX_D 1 "gpc_reg_operand"
+            "<VSa>, <VSa>,  <VSa>,  <VSa>")
+
 	 (parallel
-	  [(match_operand:QI 2 "vsx_scalar_64bit" "wD,wD,wD,wL")])))]
-  "VECTOR_MEM_VSX_P (<MODE>mode) && TARGET_POWERPC64 && TARGET_DIRECT_MOVE"
+	  [(match_operand:QI 2 "const_0_to_1_operand"
+            "wD,    wD,     wL,     n")])))]
+  "VECTOR_MEM_VSX_P (<MODE>mode)"
 {
+  int element = INTVAL (operands[2]);
  int op0_regno = REGNO (operands[0]);
  int op1_regno = REGNO (operands[1]);
-
-  if (op0_regno == op1_regno)
-    return "nop";
-
-  if (INT_REGNO_P (op0_regno))
-    return ((INTVAL (operands[2]) == VECTOR_ELEMENT_MFVSRLD_64BIT)
-	    ? "mfvsrdl %0,%x1"
-	    : "mfvsrd %0,%x1");
-
-  if (FP_REGNO_P (op0_regno) && FP_REGNO_P (op1_regno))
-    return "fmr %0,%1";
-
-  return "xxlor %x0,%x1,%x1";
-}
-  [(set_attr "type" "fp,vecsimple,mftgpr,mftgpr")
-   (set_attr "length" "4")])
-
-(define_insn "*vsx_extract_<mode>_internal2"
-  [(set (match_operand:<VS_scalar> 0 "vsx_register_operand" "=d,<VS_64reg>,<VS_64reg>")
-	(vec_select:<VS_scalar>
-	 (match_operand:VSX_D 1 "vsx_register_operand" "d,wd,wd")
-	 (parallel [(match_operand:QI 2 "u5bit_cint_operand" "wD,wD,i")])))]
-  "VECTOR_MEM_VSX_P (<MODE>mode)
-   && (!TARGET_POWERPC64 || !TARGET_DIRECT_MOVE
-       || INTVAL (operands[2]) != VECTOR_ELEMENT_SCALAR_64BIT)"
-{
  int fldDM;
-  gcc_assert (UINTVAL (operands[2]) <= 1);

-  if (INTVAL (operands[2]) == VECTOR_ELEMENT_SCALAR_64BIT)
+  gcc_assert (IN_RANGE (element, 0, 1));
+  gcc_assert (VSX_REGNO_P (op1_regno));
+
+  if (element == VECTOR_ELEMENT_SCALAR_64BIT)
    {
-      int op0_regno = REGNO (operands[0]);
-      int op1_regno = REGNO (operands[1]);
-
      if (op0_regno == op1_regno)
-	return "nop";
+	return ASM_COMMENT_START " vec_extract to same register";

-      if (FP_REGNO_P (op0_regno) && FP_REGNO_P (op1_regno))
+      else if (INT_REGNO_P (op0_regno) && TARGET_DIRECT_MOVE
+	       && TARGET_POWERPC64)
+	return "mfvsrd %0,%x1";
+
+      else if (FP_REGNO_P (op0_regno) && FP_REGNO_P (op1_regno))
 	return "fmr %0,%1";

-      return "xxlor %x0,%x1,%x1";
+      else if (VSX_REGNO_P (op0_regno))
+	return "xxlor %x0,%x1,%x1";
+
+      else
+	gcc_unreachable ();
    }

-  fldDM = INTVAL (operands[2]) << 1;
-  if (!BYTES_BIG_ENDIAN)
-    fldDM = 3 - fldDM;
-  operands[3] = GEN_INT (fldDM);
-  return "xxpermdi %x0,%x1,%x1,%3";
+  else if (element == VECTOR_ELEMENT_MFVSRLD_64BIT && INT_REGNO_P (op0_regno)
+	   && TARGET_P9_VECTOR && TARGET_POWERPC64 && TARGET_DIRECT_MOVE)
+    return "mfvsrdl %0,%x1";
+
+  else if (VSX_REGNO_P (op0_regno))
+    {
+      fldDM = element << 1;
+      if (!BYTES_BIG_ENDIAN)
+	fldDM = 3 - fldDM;
+      operands[3] = GEN_INT (fldDM);
+      return "xxpermdi %x0,%x1,%x1,%3";
+    }
+
+  else
+    gcc_unreachable ();
 }
-  [(set_attr "type" "fp,vecsimple,vecperm")
-   (set_attr "length" "4")])
+  [(set_attr "type" "vecsimple,mftgpr,mftgpr,vecperm")])

 ;; Optimize extracting a single scalar element from memory if the scalar is in
 ;; the correct location to use a single load.
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1009,6 +1009,7 @@ See RS/6000 and PowerPC Options.
 -mquad-memory-atomic -mno-quad-memory-atomic @gol
 -mcompat-align-parm -mno-compat-align-parm @gol
 -mupper-regs-df -mno-upper-regs-df -mupper-regs-sf -mno-upper-regs-sf @gol
+-mupper-regs-di -mno-upper-regs-di @gol
 -mupper-regs -mno-upper-regs -mmodulo -mno-modulo @gol
 -mfloat128 -mno-float128 -mfloat128-hardware -mno-float128-hardware @gol
 -mpower9-fusion -mno-mpower9-fusion -mpower9-vector -mno-power9-vector @gol
@@ -20255,6 +20256,17 @@ Generate code that uses (does not use) the atomic quad word memory
 instructions.  The @option{-mquad-memory-atomic} option requires use of
 64-bit mode.

+@item -mupper-regs-di
+@itemx -mno-upper-regs-di
+@opindex mupper-regs-di
+@opindex mno-upper-regs-di
+Generate code that uses (does not use) the scalar instructions that
+target all 64 registers in the vector/scalar floating point register
+set that were added in version 2.06 of the PowerPC ISA when processing
+integers.  @option{-mupper-regs-di} is turned on by default if you use
+any of the @option{-mcpu=power7}, @option{-mcpu=power8},
+@option{-mcpu=power9}, or @option{-mvsx} options.
+
 @item -mupper-regs-df
 @itemx -mno-upper-regs-df
 @opindex mupper-regs-df
@@ -20263,8 +20275,8 @@ Generate code that uses (does not use) the scalar double precision
 instructions that target all 64 registers in the vector/scalar
 floating point register set that were added in version 2.06 of the
 PowerPC ISA.  @option{-mupper-regs-df} is turned on by default if you
-use any of the @option{-mcpu=power7}, @option{-mcpu=power8}, or
-@option{-mvsx} options.
+use any of the @option{-mcpu=power7}, @option{-mcpu=power8},
+@option{-mcpu=power9}, or @option{-mvsx} options.

 @item -mupper-regs-sf
 @itemx -mno-upper-regs-sf
@@ -20274,8 +20286,8 @@ Generate code that uses (does not use) the scalar single precision
 instructions that target all 64 registers in the vector/scalar
 floating point register set that were added in version 2.07 of the
 PowerPC ISA.  @option{-mupper-regs-sf} is turned on by default if you
-use either of the @option{-mcpu=power8} or @option{-mpower8-vector}
-options.
+use either of the @option{-mcpu=power8}, @option{-mpower8-vector}, or
+@option{-mpower9} options.

 @item -mupper-regs
 @itemx -mno-upper-regs
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -3211,6 +3211,9 @@ FP or VSX register to perform ISA 2.07 float ops or NO_REGS.
 @item wz
 Floating point register if the LFIWZX instruction is enabled or NO_REGS.

+@item wB
+Signed 5-bit constant integer that can be loaded into an altivec register.
+
 @item wD
 Int constant that is the element number of the 64-bit scalar in a vector.

--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2016-06-15  Michael Meissner  <meissner@linux.vnet.ibm.com>
+
+	* gcc.target/powerpc/p9-dimode1.c: New test.
+	* gcc.target/powerpc/p9-dimode2.c: Likewise.
+
 2016-06-15  Jakub Jelinek  <jakub@redhat.com>

 	* gcc.c-torture/compile/20160615-1.c: New test.
--- a/gcc/testsuite/gcc.target/powerpc/p9-dimode1.c
+++ b/gcc/testsuite/gcc.target/powerpc/p9-dimode1.c
@@ -0,0 +1,50 @@
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2 -mupper-regs-di" } */
+
+/* Verify P9 changes to allow DImode into Altivec registers, and generate
+   constants using XXSPLTIB.  */
+
+#ifndef _ARCH_PPC64
+#error "This code is 64-bit."
+#endif
+
+double
+p9_zero (void)
+{
+  long l = 0;
+  double ret;
+
+  __asm__ ("xxlor %x0,%x1,%x1" : "=&d" (ret) : "wi" (l));
+
+  return ret;
+}
+
+double
+p9_plus_1 (void)
+{
+  long l = 1;
+  double ret;
+
+  __asm__ ("xxlor %x0,%x1,%x1" : "=&d" (ret) : "wi" (l));
+
+  return ret;
+}
+
+double
+p9_minus_1 (void)
+{
+  long l = -1;
+  double ret;
+
+  __asm__ ("xxlor %x0,%x1,%x1" : "=&d" (ret) : "wi" (l));
+
+  return ret;
+}
+
+/* { dg-final { scan-assembler     "xxspltib" } } */
+/* { dg-final { scan-assembler-not "mtvsrd"   } } */
+/* { dg-final { scan-assembler-not "lfd"      } } */
+/* { dg-final { scan-assembler-not "ld"       } } */
+/* { dg-final { scan-assembler-not "lxsd"     } } */
--- a/gcc/testsuite/gcc.target/powerpc/p9-dimode2.c
+++ b/gcc/testsuite/gcc.target/powerpc/p9-dimode2.c
@@ -0,0 +1,27 @@
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mcpu=power9 -O2 -mupper-regs-di" } */
+
+/* Verify that large integer constants are loaded via direct move instead of being
+   loaded from memory.  */
+
+#ifndef _ARCH_PPC64
+#error "This code is 64-bit."
+#endif
+
+double
+p9_large (void)
+{
+  long l = 0x12345678;
+  double ret;
+
+  __asm__ ("xxlor %x0,%x1,%x1" : "=&d" (ret) : "wi" (l));
+
+  return ret;
+}
+
+/* { dg-final { scan-assembler     "mtvsrd"   } } */
+/* { dg-final { scan-assembler-not "ld"       } } */
+/* { dg-final { scan-assembler-not "lfd"      } } */
+/* { dg-final { scan-assembler-not "lxsd"     } } */
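
The combined vsx_extract_<mode> insn in the vsx.md hunk falls back to `xxpermdi` when no move or direct-move form applies, and derives the DM immediate from the element number and the target byte order. That arithmetic can be sketched in standalone C; the function name `xxpermdi_dm_for_extract` and its `bytes_big_endian` parameter are hypothetical stand-ins for the insn's inline code and GCC's BYTES_BIG_ENDIAN macro, and the range check mirrors the insn's `gcc_assert`.

```c
/* Illustrative sketch only; not part of the patch.  Mirrors the fldDM
   computation in the vsx_extract_<mode> output code.  */

/* Return the xxpermdi DM immediate that selects doubleword ELEMENT
   (0 or 1) of a V2DI/V2DF register, or -1 if ELEMENT is out of range.
   BYTES_BIG_ENDIAN is nonzero on a big-endian target.  */
static int
xxpermdi_dm_for_extract (int element, int bytes_big_endian)
{
  int dm;

  if (element < 0 || element > 1)
    return -1;

  /* Element N maps to DM value 2*N in big-endian element order.  */
  dm = element << 1;

  /* On little-endian targets the element numbering is reversed.  */
  if (!bytes_big_endian)
    dm = 3 - dm;

  return dm;
}
```

For example, extracting element 0 yields a DM value of 0 on a big-endian target and 3 on a little-endian one.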