Stack usage support

	* common.opt (-fstack-usage): New option.
	* doc/invoke.texi (Debugging options): Document it.
	* builtins.c (expand_builtin_apply): Pass TRUE as 4th argument to
	allocate_dynamic_stack_space.
	(expand_builtin_alloca): Add 4th bool parameter CANNOT_ACCUMULATE
	and propagate it to allocate_dynamic_stack_space.
	(expand_builtin) <BUILT_IN_ALLOCA>: Adjust for above change.
	* calls.c (initialize_argument_information): Pass TRUE as 4th
	argument to allocate_dynamic_stack_space.
	(expand_call): Set current_function_has_unbounded_dynamic_stack_size
	to 1 when pushing a variable-sized argument onto the stack.  Pass
	TRUE as 4th argument to allocate_dynamic_stack_space.
	Update current_function_pushed_stack_size.
	(emit_library_call_value_1): Likewise.
	* explow.c (allocate_dynamic_stack_space): Add 4th bool parameter
	CANNOT_ACCUMULATE.  If flag_stack_usage, look into the size and
	attempt to find an upper bound.  Remove redundant code for the
	SETJMP_VIA_SAVE_AREA case.
	* expr.h (allocate_dynamic_stack_space): Add 4th bool parameter.
	* function.h (struct stack_usage): New structure.
	(current_function_static_stack_size): New macro.
	(current_function_dynamic_stack_size): Likewise.
	(current_function_pushed_stack_size): Likewise.
	(current_function_dynamic_alloc_count): Likewise.
	(current_function_has_unbounded_dynamic_stack_size): Likewise.
	(current_function_allocates_dynamic_stack_space): Likewise.
	(struct function): Add new field 'su'.
	* function.c (instantiate_virtual_regs): If SETJMP_VIA_SAVE_AREA,
	add the value of the dynamic offset to the dynamic stack usage.
	(gimplify_parameters): Set ALLOCA_FOR_VAR_P on call to BUILT_IN_ALLOCA
	for variable-sized objects.
	(prepare_function_start): Allocate cfun->su if flag_stack_usage.
	(rest_of_handle_thread_prologue_and_epilogue): Call output_stack_usage.
	* gimplify.c (gimplify_decl_expr): Set ALLOCA_FOR_VAR_P on call to
	BUILT_IN_ALLOCA for variable-sized objects.
	* output.h (output_stack_usage): Declare.
	* toplev.c (stack_usage_file): New file pointer.
	(output_stack_usage): New function.
	(open_auxiliary_file): Likewise.
	(lang_dependent_init): Open file if flag_stack_usage is set.
	(finalize): Close file if stack_usage_file is not null.
	* tree.h (ALLOCA_FOR_VAR_P): New macro.
	* config/alpha/alpha.c (compute_frame_size): New function.
	(alpha_expand_prologue): Use it.
	(alpha_start_function): Likewise.
	(alpha_expand_epilogue): Likewise.  Set stack usage info.
	* config/i386/i386.c (ix86_expand_prologue): Likewise.
	* config/ia64/ia64.c (ia64_expand_prologue): Likewise.
	* config/mips/mips.c (mips_expand_prologue): Likewise.
	* config/pa/pa.c (hppa_expand_prologue): Likewise.
	* config/rs6000/rs6000.c (rs6000_emit_prologue): Likewise.
	* config/sparc/sparc.c (sparc_expand_prologue): Likewise.
testsuite/
	* lib/gcc-dg.exp (cleanup-stack-usage): New procedure.
	* lib/scanasm.exp (scan-stack-usage): Likewise.
	(scan-stack-usage-not): Likewise.
	* gcc.dg/stack-usage-1.c: New test.
	* gcc.target/i386/stack-usage-realign.c: Likewise.

From-SVN: r163660
Eric Botcazou 2010-08-30 20:04:49 +00:00
parent 1987baa3ab
commit d3c1230697
25 changed files with 557 additions and 110 deletions


@@ -1,3 +1,59 @@
2010-08-30 Eric Botcazou <ebotcazou@adacore.com>
2010-08-30 Zdenek Dvorak <ook@ucw.cz>
PR tree-optimization/45427
@@ -6,7 +62,7 @@
(number_of_iterations_ne): Pass exit_must_be_taken to
number_of_iterations_ne_max.
2010-08-31 Catherine Moore <clm@codesourcery.com>
2010-08-30 Catherine Moore <clm@codesourcery.com>
* config/mips/mips.h (BASE_DRIVER_SELF_SPECS):
Infer -mdspr2 for the 74K.


@@ -132,7 +132,7 @@ static rtx expand_builtin_memset (tree, rtx, enum machine_mode);
static rtx expand_builtin_memset_args (tree, tree, tree, rtx, enum machine_mode, tree);
static rtx expand_builtin_bzero (tree);
static rtx expand_builtin_strlen (tree, rtx, enum machine_mode);
static rtx expand_builtin_alloca (tree, rtx);
static rtx expand_builtin_alloca (tree, rtx, bool);
static rtx expand_builtin_unop (enum machine_mode, tree, rtx, rtx, optab);
static rtx expand_builtin_frame_address (tree, tree);
static tree stabilize_va_list_loc (location_t, tree, int);
@@ -1588,8 +1588,10 @@ expand_builtin_apply (rtx function, rtx arguments, rtx argsize)
emit_stack_save (SAVE_BLOCK, &old_stack_level, NULL_RTX);
/* Allocate a block of memory onto the stack and copy the memory
arguments to the outgoing arguments address. */
allocate_dynamic_stack_space (argsize, 0, BITS_PER_UNIT);
arguments to the outgoing arguments address. We can pass TRUE
as the 4th argument because we just saved the stack pointer
and will restore it right after the call. */
allocate_dynamic_stack_space (argsize, 0, BITS_PER_UNIT, TRUE);
/* Set DRAP flag to true, even though allocate_dynamic_stack_space
may have already set current_function_calls_alloca to true.
@@ -4949,12 +4951,13 @@ expand_builtin_frame_address (tree fndecl, tree exp)
}
}
/* Expand EXP, a call to the alloca builtin. Return NULL_RTX if
we failed and the caller should emit a normal call, otherwise try to get
the result in TARGET, if convenient. */
/* Expand EXP, a call to the alloca builtin. Return NULL_RTX if we
failed and the caller should emit a normal call, otherwise try to
get the result in TARGET, if convenient. CANNOT_ACCUMULATE is the
same as for allocate_dynamic_stack_space. */
static rtx
expand_builtin_alloca (tree exp, rtx target)
expand_builtin_alloca (tree exp, rtx target, bool cannot_accumulate)
{
rtx op0;
rtx result;
@@ -4970,7 +4973,8 @@ expand_builtin_alloca (tree exp, rtx target)
op0 = expand_normal (CALL_EXPR_ARG (exp, 0));
/* Allocate the desired space. */
result = allocate_dynamic_stack_space (op0, target, BITS_PER_UNIT);
result = allocate_dynamic_stack_space (op0, target, BITS_PER_UNIT,
cannot_accumulate);
result = convert_memory_address (ptr_mode, result);
return result;
@@ -6009,7 +6013,9 @@ expand_builtin (tree exp, rtx target, rtx subtarget, enum machine_mode mode,
return XEXP (DECL_RTL (DECL_RESULT (current_function_decl)), 0);
case BUILT_IN_ALLOCA:
target = expand_builtin_alloca (exp, target);
/* If the allocation stems from the declaration of a variable-sized
object, it cannot accumulate. */
target = expand_builtin_alloca (exp, target, ALLOCA_FOR_VAR_P (exp));
if (target)
return target;
break;


@@ -1095,9 +1095,13 @@ initialize_argument_information (int num_actuals ATTRIBUTE_UNUSED,
pending_stack_adjust = 0;
}
/* We can pass TRUE as the 4th argument because we just
saved the stack pointer and will restore it right after
the call. */
copy = gen_rtx_MEM (BLKmode,
allocate_dynamic_stack_space
(size_rtx, NULL_RTX, TYPE_ALIGN (type)));
(size_rtx, NULL_RTX,
TYPE_ALIGN (type), TRUE));
set_mem_attributes (copy, type, 1);
}
else
@@ -2492,6 +2496,8 @@ expand_call (tree exp, rtx target, int ignore)
stack_arg_under_construction = 0;
}
argblock = push_block (ARGS_SIZE_RTX (adjusted_args_size), 0, 0);
if (flag_stack_usage)
current_function_has_unbounded_dynamic_stack_size = 1;
}
else
{
@@ -2653,8 +2659,11 @@ expand_call (tree exp, rtx target, int ignore)
stack_usage_map = stack_usage_map_buf;
highest_outgoing_arg_in_use = 0;
}
/* We can pass TRUE as the 4th argument because we just
saved the stack pointer and will restore it right after
the call. */
allocate_dynamic_stack_space (push_size, NULL_RTX,
BITS_PER_UNIT);
BITS_PER_UNIT, TRUE);
}
/* If argument evaluation might modify the stack pointer,
@@ -2694,6 +2703,19 @@ expand_call (tree exp, rtx target, int ignore)
be deferred during the evaluation of the arguments. */
NO_DEFER_POP;
/* Record the maximum pushed stack space size. We need to delay
doing it this far to take into account the optimization done
by combine_pending_stack_adjustment_and_call. */
if (flag_stack_usage
&& !ACCUMULATE_OUTGOING_ARGS
&& pass
&& adjusted_args_size.var == 0)
{
int pushed = adjusted_args_size.constant + pending_stack_adjust;
if (pushed > current_function_pushed_stack_size)
current_function_pushed_stack_size = pushed;
}
funexp = rtx_for_function_call (fndecl, addr);
/* Figure out the register where the value, if any, will come back. */
@@ -3551,6 +3573,13 @@ emit_library_call_value_1 (int retval, rtx orgfun, rtx value,
if (args_size.constant > crtl->outgoing_args_size)
crtl->outgoing_args_size = args_size.constant;
if (flag_stack_usage && !ACCUMULATE_OUTGOING_ARGS)
{
int pushed = args_size.constant + pending_stack_adjust;
if (pushed > current_function_pushed_stack_size)
current_function_pushed_stack_size = pushed;
}
if (ACCUMULATE_OUTGOING_ARGS)
{
/* Since the stack pointer will never be pushed, it is possible for


@@ -1281,6 +1281,10 @@ fstack-protector-all
Common Report RejectNegative Var(flag_stack_protect, 2) VarExists
Use a stack protection method for every function
fstack-usage
Common RejectNegative Var(flag_stack_usage)
Output stack usage information on a per-function basis
fstrength-reduce
Common
Does nothing. Preserved for backward compatibility.


@@ -7763,6 +7763,30 @@ emit_frame_store (unsigned int regno, rtx base_reg,
emit_frame_store_1 (reg, base_reg, frame_bias, base_ofs, reg);
}
/* Compute the frame size. SIZE is the size of the "naked" frame
and SA_SIZE is the size of the register save area. */
static HOST_WIDE_INT
compute_frame_size (HOST_WIDE_INT size, HOST_WIDE_INT sa_size)
{
if (TARGET_ABI_OPEN_VMS)
return ALPHA_ROUND (sa_size
+ (alpha_procedure_type == PT_STACK ? 8 : 0)
+ size
+ crtl->args.pretend_args_size);
else if (TARGET_ABI_UNICOSMK)
/* We have to allocate space for the DSIB if we generate a frame. */
return ALPHA_ROUND (sa_size
+ (alpha_procedure_type == PT_STACK ? 48 : 0))
+ ALPHA_ROUND (size
+ crtl->outgoing_args_size);
else
return ALPHA_ROUND (crtl->outgoing_args_size)
+ sa_size
+ ALPHA_ROUND (size
+ crtl->args.pretend_args_size);
}
/* Write function prologue. */
/* On vms we have two kinds of functions:
@@ -7796,24 +7820,10 @@ alpha_expand_prologue (void)
int i;
sa_size = alpha_sa_size ();
frame_size = compute_frame_size (get_frame_size (), sa_size);
frame_size = get_frame_size ();
if (TARGET_ABI_OPEN_VMS)
frame_size = ALPHA_ROUND (sa_size
+ (alpha_procedure_type == PT_STACK ? 8 : 0)
+ frame_size
+ crtl->args.pretend_args_size);
else if (TARGET_ABI_UNICOSMK)
/* We have to allocate space for the DSIB if we generate a frame. */
frame_size = ALPHA_ROUND (sa_size
+ (alpha_procedure_type == PT_STACK ? 48 : 0))
+ ALPHA_ROUND (frame_size
+ crtl->outgoing_args_size);
else
frame_size = (ALPHA_ROUND (crtl->outgoing_args_size)
+ sa_size
+ ALPHA_ROUND (frame_size
+ crtl->args.pretend_args_size));
if (flag_stack_usage)
current_function_static_stack_size = frame_size;
if (TARGET_ABI_OPEN_VMS)
reg_offset = 8 + 8 * cfun->machine->uses_condition_handler;
@@ -8135,23 +8145,7 @@ alpha_start_function (FILE *file, const char *fnname,
alpha_fnname = fnname;
sa_size = alpha_sa_size ();
frame_size = get_frame_size ();
if (TARGET_ABI_OPEN_VMS)
frame_size = ALPHA_ROUND (sa_size
+ (alpha_procedure_type == PT_STACK ? 8 : 0)
+ frame_size
+ crtl->args.pretend_args_size);
else if (TARGET_ABI_UNICOSMK)
frame_size = ALPHA_ROUND (sa_size
+ (alpha_procedure_type == PT_STACK ? 48 : 0))
+ ALPHA_ROUND (frame_size
+ crtl->outgoing_args_size);
else
frame_size = (ALPHA_ROUND (crtl->outgoing_args_size)
+ sa_size
+ ALPHA_ROUND (frame_size
+ crtl->args.pretend_args_size));
frame_size = compute_frame_size (get_frame_size (), sa_size);
if (TARGET_ABI_OPEN_VMS)
reg_offset = 8 + 8 * cfun->machine->uses_condition_handler;
@@ -8353,23 +8347,7 @@ alpha_expand_epilogue (void)
int i;
sa_size = alpha_sa_size ();
frame_size = get_frame_size ();
if (TARGET_ABI_OPEN_VMS)
frame_size = ALPHA_ROUND (sa_size
+ (alpha_procedure_type == PT_STACK ? 8 : 0)
+ frame_size
+ crtl->args.pretend_args_size);
else if (TARGET_ABI_UNICOSMK)
frame_size = ALPHA_ROUND (sa_size
+ (alpha_procedure_type == PT_STACK ? 48 : 0))
+ ALPHA_ROUND (frame_size
+ crtl->outgoing_args_size);
else
frame_size = (ALPHA_ROUND (crtl->outgoing_args_size)
+ sa_size
+ ALPHA_ROUND (frame_size
+ crtl->args.pretend_args_size));
frame_size = compute_frame_size (get_frame_size (), sa_size);
if (TARGET_ABI_OPEN_VMS)
{


@@ -9613,6 +9613,29 @@ ix86_expand_prologue (void)
allocate = frame.stack_pointer_offset - m->fs.sp_offset;
if (flag_stack_usage)
{
/* We start to count from ARG_POINTER. */
HOST_WIDE_INT stack_size = frame.stack_pointer_offset;
/* If it was realigned, take into account the fake frame. */
if (stack_realign_drap)
{
if (ix86_static_chain_on_stack)
stack_size += UNITS_PER_WORD;
if (!call_used_regs[REGNO (crtl->drap_reg)])
stack_size += UNITS_PER_WORD;
/* This over-estimates by 1 minimal-stack-alignment-unit but
mitigates that by counting in the new return address slot. */
current_function_dynamic_stack_size
+= crtl->stack_alignment_needed / BITS_PER_UNIT;
}
current_function_static_stack_size = stack_size;
}
/* The stack has already been decremented by the instruction calling us
so we need to probe unconditionally to preserve the protection area. */
if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK)


@@ -3097,6 +3097,9 @@ ia64_expand_prologue (void)
ia64_compute_frame_size (get_frame_size ());
last_scratch_gr_reg = 15;
if (flag_stack_usage)
current_function_static_stack_size = current_frame_info.total_size;
if (dump_file)
{
fprintf (dump_file, "ia64 frame related registers "


@@ -10078,6 +10078,9 @@ mips_expand_prologue (void)
frame = &cfun->machine->frame;
size = frame->total_size;
if (flag_stack_usage)
current_function_static_stack_size = size;
/* Save the registers. Allocate up to MIPS_MAX_FIRST_STACK_STEP
bytes beforehand; this is enough to cover the register save area
without going out of range. */


@@ -3734,6 +3734,8 @@ hppa_expand_prologue (void)
local_fsize += STARTING_FRAME_OFFSET;
actual_fsize = compute_frame_size (size, &save_fregs);
if (flag_stack_usage)
current_function_static_stack_size = actual_fsize;
/* Compute a few things we will use often. */
tmpreg = gen_rtx_REG (word_mode, 1);


@@ -19716,6 +19716,9 @@ rs6000_emit_prologue (void)
&& call_used_regs[STATIC_CHAIN_REGNUM]);
HOST_WIDE_INT sp_offset = 0;
if (flag_stack_usage)
current_function_static_stack_size = info->total_size;
if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK && info->total_size)
rs6000_emit_probe_stack_range (STACK_CHECK_PROTECT, info->total_size);


@@ -4402,6 +4402,9 @@ sparc_expand_prologue (void)
/* Advertise that the data calculated just above are now valid. */
sparc_prologue_data_valid_p = true;
if (flag_stack_usage)
current_function_static_stack_size = actual_fsize;
if (flag_stack_check == STATIC_BUILTIN_STACK_CHECK && actual_fsize)
sparc_emit_probe_stack_range (STACK_CHECK_PROTECT, actual_fsize);


@@ -313,7 +313,7 @@ Objective-C and Objective-C++ Dialects}.
-fmem-report -fpre-ipa-mem-report -fpost-ipa-mem-report -fprofile-arcs @gol
-frandom-seed=@var{string} -fsched-verbose=@var{n} @gol
-fsel-sched-verbose -fsel-sched-dump-cfg -fsel-sched-pipelining-verbose @gol
-ftest-coverage -ftime-report -fvar-tracking @gol
-fstack-usage -ftest-coverage -ftime-report -fvar-tracking @gol
-fvar-tracking-assignments -fvar-tracking-assignments-toggle @gol
-g -g@var{level} -gtoggle -gcoff -gdwarf-@var{version} @gol
-ggdb -gstabs -gstabs+ -gstrict-dwarf -gno-strict-dwarf @gol
@@ -4852,6 +4852,39 @@
Makes the compiler print some statistics about permanent memory
allocation before or after interprocedural optimization.
@item -fstack-usage
@opindex fstack-usage
Makes the compiler output stack usage information for the program, on a
per-function basis. The filename for the dump is made by appending
@file{.su} to the AUXNAME. AUXNAME is generated from the name of
the output file, if explicitly specified and it is not an executable,
otherwise it is the basename of the source file. An entry is made up
of three fields:
@itemize
@item
The name of the function.
@item
A number of bytes.
@item
One or more qualifiers: @code{static}, @code{dynamic}, @code{bounded}.
@end itemize
The qualifier @code{static} means that the function manipulates the stack
statically: a fixed number of bytes are allocated for the frame on function
entry and released on function exit; no stack adjustments are otherwise made
in the function. The second field is this fixed number of bytes.
The qualifier @code{dynamic} means that the function manipulates the stack
dynamically: in addition to the static allocation described above, stack
adjustments are made in the body of the function, for example to push/pop
arguments around function calls. If the qualifier @code{bounded} is also
present, the amount of these adjustments is bounded at compile-time and
the second field is an upper bound of the total amount of stack used by
the function. If it is not present, the amount of these adjustments is
not bounded at compile-time and the second field only represents the
bounded part.
@item -fprofile-arcs
@opindex fprofile-arcs
Add code so that program flow @dfn{arcs} are instrumented. During

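To make the documented .su format concrete, here is a hypothetical example (ours, not from the patch): a function whose frame is a single fixed-size array, so its entry would carry the @code{static} qualifier with the second field giving the frame size. The exact byte count is target-dependent, as the new stack-usage-1.c test below demonstrates.

```c
#include <string.h>

/* Compiled with -fstack-usage, a function like this one would yield a
   .su line of the form "fill_buffer\t<bytes>\tstatic": roughly the
   256-byte array plus target-dependent overhead, allocated on entry,
   released on exit, with no stack adjustment in between. */
int fill_buffer(void)
{
    char arr[256];
    memset(arr, 0, sizeof arr);
    return (int) arr[0];
}
```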

@@ -1114,11 +1114,22 @@ update_nonlocal_goto_save_area (void)
SIZE is an rtx representing the size of the area.
TARGET is a place in which the address can be placed.
KNOWN_ALIGN is the alignment (in bits) that we know SIZE has. */
KNOWN_ALIGN is the alignment (in bits) that we know SIZE has.
If CANNOT_ACCUMULATE is set to TRUE, the caller guarantees that the
stack space allocated by the generated code cannot be added with itself
in the course of the execution of the function. It is always safe to
pass FALSE here and the following criterion is sufficient in order to
pass TRUE: every path in the CFG that starts at the allocation point and
loops to it executes the associated deallocation code. */
rtx
allocate_dynamic_stack_space (rtx size, rtx target, int known_align)
allocate_dynamic_stack_space (rtx size, rtx target, int known_align,
bool cannot_accumulate)
{
HOST_WIDE_INT stack_usage_size = -1;
bool known_align_valid = true;
/* If we're asking for zero bytes, it doesn't matter what we point
to since we can't dereference it. But return a reasonable
address anyway. */
@@ -1128,6 +1139,37 @@ allocate_dynamic_stack_space (rtx size, rtx target, int known_align)
/* Otherwise, show we're calling alloca or equivalent. */
cfun->calls_alloca = 1;
/* If stack usage info is requested, look into the size we are passed.
We need to do so this early to avoid the obfuscation that may be
introduced later by the various alignment operations. */
if (flag_stack_usage)
{
if (GET_CODE (size) == CONST_INT)
stack_usage_size = INTVAL (size);
else if (GET_CODE (size) == REG)
{
/* Look into the last emitted insn and see if we can deduce
something for the register. */
rtx insn, set, note;
insn = get_last_insn ();
if ((set = single_set (insn)) && rtx_equal_p (SET_DEST (set), size))
{
if (GET_CODE (SET_SRC (set)) == CONST_INT)
stack_usage_size = INTVAL (SET_SRC (set));
else if ((note = find_reg_equal_equiv_note (insn))
&& GET_CODE (XEXP (note, 0)) == CONST_INT)
stack_usage_size = INTVAL (XEXP (note, 0));
}
}
/* If the size is not constant, we can't say anything. */
if (stack_usage_size == -1)
{
current_function_has_unbounded_dynamic_stack_size = 1;
stack_usage_size = 0;
}
}
/* Ensure the size is in the proper mode. */
if (GET_MODE (size) != VOIDmode && GET_MODE (size) != Pmode)
size = convert_to_mode (Pmode, size, 1);
@@ -1157,10 +1199,17 @@ allocate_dynamic_stack_space (rtx size, rtx target, int known_align)
#endif
if (MUST_ALIGN)
size
= force_operand (plus_constant (size,
BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
NULL_RTX);
{
size
= force_operand (plus_constant (size,
BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1),
NULL_RTX);
if (flag_stack_usage)
stack_usage_size += BIGGEST_ALIGNMENT / BITS_PER_UNIT - 1;
known_align_valid = false;
}
#ifdef SETJMP_VIA_SAVE_AREA
/* If setjmp restores regs from a save area in the stack frame,
@@ -1174,32 +1223,7 @@ allocate_dynamic_stack_space (rtx size, rtx target, int known_align)
would use reg notes to store the "optimized" size and fix things
up later. These days we know this information before we ever
start building RTL so the reg notes are unnecessary. */
if (!cfun->calls_setjmp)
{
int align = PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT;
/* ??? Code below assumes that the save area needs maximal
alignment. This constraint may be too strong. */
gcc_assert (PREFERRED_STACK_BOUNDARY == BIGGEST_ALIGNMENT);
if (CONST_INT_P (size))
{
HOST_WIDE_INT new_size = INTVAL (size) / align * align;
if (INTVAL (size) != new_size)
size = GEN_INT (new_size);
}
else
{
/* Since we know overflow is not possible, we avoid using
CEIL_DIV_EXPR and use TRUNC_DIV_EXPR instead. */
size = expand_divmod (0, TRUNC_DIV_EXPR, Pmode, size,
GEN_INT (align), NULL_RTX, 1);
size = expand_mult (Pmode, size,
GEN_INT (align), NULL_RTX, 1);
}
}
else
if (cfun->calls_setjmp)
{
rtx dynamic_offset
= expand_binop (Pmode, sub_optab, virtual_stack_dynamic_rtx,
@@ -1207,6 +1231,14 @@ allocate_dynamic_stack_space (rtx size, rtx target, int known_align)
size = expand_binop (Pmode, add_optab, size, dynamic_offset,
NULL_RTX, 1, OPTAB_LIB_WIDEN);
/* The above dynamic offset cannot be computed statically at this
point, but it will be possible to do so after RTL expansion is
done. Record how many times we will need to add it. */
if (flag_stack_usage)
current_function_dynamic_alloc_count++;
known_align_valid = false;
}
#endif /* SETJMP_VIA_SAVE_AREA */
@@ -1223,13 +1255,28 @@ allocate_dynamic_stack_space (rtx size, rtx target, int known_align)
insns. Since this is an extremely rare event, we have no reliable
way of knowing which systems have this problem. So we avoid even
momentarily mis-aligning the stack. */
if (!known_align_valid || known_align % PREFERRED_STACK_BOUNDARY != 0)
{
size = round_push (size);
/* If we added a variable amount to SIZE,
we can no longer assume it is aligned. */
#if !defined (SETJMP_VIA_SAVE_AREA)
if (MUST_ALIGN || known_align % PREFERRED_STACK_BOUNDARY != 0)
#endif
size = round_push (size);
if (flag_stack_usage)
{
int align = PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT;
stack_usage_size = (stack_usage_size + align - 1) / align * align;
}
}
/* The size is supposed to be fully adjusted at this point so record it
if stack usage info is requested. */
if (flag_stack_usage)
{
current_function_dynamic_stack_size += stack_usage_size;
/* ??? This is gross but the only safe stance in the absence
of stack usage oriented flow analysis. */
if (!cannot_accumulate)
current_function_has_unbounded_dynamic_stack_size = 1;
}
do_pending_stack_adjust ();

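The rounding applied to stack_usage_size in the hunk above is the standard round-up-to-alignment idiom, also used by the function.c hunk for the setjmp save area. A standalone sketch (the function name is ours, not GCC's):

```c
/* (size + align - 1) / align * align rounds SIZE up to the next
   multiple of ALIGN, matching the PREFERRED_STACK_BOUNDARY rounding
   performed when the requested size must be re-aligned.  It works for
   any positive ALIGN, not only powers of two. */
long round_up_to(long size, long align)
{
    return (size + align - 1) / align * align;
}
```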

@@ -641,9 +641,8 @@ extern void emit_stack_restore (enum save_level, rtx, rtx);
/* Invoke emit_stack_save for the nonlocal_goto_save_area. */
extern void update_nonlocal_goto_save_area (void);
/* Allocate some space on the stack dynamically and return its address. An rtx
says how many bytes. */
extern rtx allocate_dynamic_stack_space (rtx, rtx, int);
/* Allocate some space on the stack dynamically and return its address. */
extern rtx allocate_dynamic_stack_space (rtx, rtx, int, bool);
/* Emit one stack probe at ADDRESS, an address within the stack. */
extern void emit_stack_probe (rtx);


@@ -1899,6 +1899,18 @@ instantiate_virtual_regs (void)
/* Indicate that, from now on, assign_stack_local should use
frame_pointer_rtx. */
virtuals_instantiated = 1;
/* See allocate_dynamic_stack_space for the rationale. */
#ifdef SETJMP_VIA_SAVE_AREA
if (flag_stack_usage && cfun->calls_setjmp)
{
int align = PREFERRED_STACK_BOUNDARY / BITS_PER_UNIT;
dynamic_offset = (dynamic_offset + align - 1) / align * align;
current_function_dynamic_stack_size
+= current_function_dynamic_alloc_count * dynamic_offset;
}
#endif
return 0;
}
@@ -3586,6 +3598,8 @@ gimplify_parameters (void)
t = built_in_decls[BUILT_IN_ALLOCA];
t = build_call_expr (t, 1, DECL_SIZE_UNIT (parm));
/* The call has been built for a variable-sized object. */
ALLOCA_FOR_VAR_P (t) = 1;
t = fold_convert (ptr_type, t);
t = build2 (MODIFY_EXPR, TREE_TYPE (addr), addr, t);
gimplify_and_add (t, &stmts);
@@ -4365,6 +4379,12 @@ prepare_function_start (void)
init_expr ();
default_rtl_profile ();
if (flag_stack_usage)
{
cfun->su = ggc_alloc_cleared_stack_usage ();
cfun->su->static_stack_size = -1;
}
cse_not_expected = ! optimize;
/* Caller save not needed yet. */
@@ -5753,12 +5773,17 @@ rest_of_handle_thread_prologue_and_epilogue (void)
{
if (optimize)
cleanup_cfg (CLEANUP_EXPENSIVE);
/* On some machines, the prologue and epilogue code, or parts thereof,
can be represented as RTL. Doing so lets us schedule insns between
it and the rest of the code and also allows delayed branch
scheduling to operate in the epilogue. */
thread_prologue_and_epilogue_insns ();
/* The stack usage info is finalized during prologue expansion. */
if (flag_stack_usage)
output_stack_usage ();
return 0;
}

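The SETJMP_VIA_SAVE_AREA hunk in instantiate_virtual_regs finalizes the dynamic usage once the dynamic offset is known: each allocation recorded during expansion had the then-unknown offset folded into its size, so the rounded offset is counted once per recorded allocation. A hedged sketch of that arithmetic (names are illustrative, not GCC's):

```c
/* Sketch of the accounting done after RTL expansion: round the
   dynamic offset to the preferred stack boundary, then add it once
   for every dynamic allocation recorded in dynamic_alloc_count. */
long finalize_dynamic_usage(long dynamic_stack_size, unsigned alloc_count,
                            long dynamic_offset, long align)
{
    long rounded = (dynamic_offset + align - 1) / align * align;
    return dynamic_stack_size + (long) alloc_count * rounded;
}
```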

@@ -468,6 +468,37 @@ extern GTY(()) struct rtl_data x_rtl;
want to do differently. */
#define crtl (&x_rtl)
struct GTY(()) stack_usage
{
/* # of bytes of static stack space allocated by the function. */
HOST_WIDE_INT static_stack_size;
/* # of bytes of dynamic stack space allocated by the function. This is
meaningful only if has_unbounded_dynamic_stack_size is zero. */
HOST_WIDE_INT dynamic_stack_size;
/* # of bytes of space pushed onto the stack after the prologue. If
!ACCUMULATE_OUTGOING_ARGS, it contains the outgoing arguments. */
int pushed_stack_size;
/* # of dynamic allocations in the function. */
unsigned int dynamic_alloc_count : 31;
/* Nonzero if the amount of stack space allocated dynamically cannot
be bounded at compile-time. */
unsigned int has_unbounded_dynamic_stack_size : 1;
};
#define current_function_static_stack_size (cfun->su->static_stack_size)
#define current_function_dynamic_stack_size (cfun->su->dynamic_stack_size)
#define current_function_pushed_stack_size (cfun->su->pushed_stack_size)
#define current_function_dynamic_alloc_count (cfun->su->dynamic_alloc_count)
#define current_function_has_unbounded_dynamic_stack_size \
(cfun->su->has_unbounded_dynamic_stack_size)
#define current_function_allocates_dynamic_stack_space \
(current_function_dynamic_stack_size != 0 \
|| current_function_has_unbounded_dynamic_stack_size)
/* This structure can save all the important global and static variables
describing the status of the current function. */
@@ -486,6 +517,9 @@ struct GTY(()) function {
/* The loops in this function. */
struct loops *x_current_loops;
/* The stack usage of this function. */
struct stack_usage *su;
/* Value histograms attached to particular statements. */
htab_t GTY((skip)) value_histograms;

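The struct stack_usage fields above feed output_stack_usage in toplev.c, whose body is not shown in this extract. As a rough, hypothetical sketch of how the qualifiers documented in invoke.texi could follow from those fields (the real emission logic may differ in detail):

```c
#include <string.h>

/* Hypothetical mapping from struct stack_usage fields to the .su
   qualifiers: "static" when no dynamic allocation happens,
   "dynamic,bounded" when dynamic allocations have a compile-time
   upper bound, plain "dynamic" when they do not. */
const char *su_qualifiers(long dynamic_stack_size,
                          int has_unbounded_dynamic_stack_size)
{
    if (dynamic_stack_size == 0 && !has_unbounded_dynamic_stack_size)
        return "static";
    return has_unbounded_dynamic_stack_size ? "dynamic" : "dynamic,bounded";
}
```

This matches the entries the new tests expect, e.g. "foo\t256\tstatic" and "main\t48\tdynamic,bounded".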

@@ -1329,6 +1329,8 @@ gimplify_vla_decl (tree decl, gimple_seq *seq_p)
t = built_in_decls[BUILT_IN_ALLOCA];
t = build_call_expr (t, 1, DECL_SIZE_UNIT (decl));
/* The call has been built for a variable-sized object. */
ALLOCA_FOR_VAR_P (t) = 1;
t = fold_convert (ptr_type, t);
t = build2 (MODIFY_EXPR, TREE_TYPE (addr), addr, t);


@@ -639,6 +639,9 @@ extern int maybe_assemble_visibility (tree);
extern int default_address_cost (rtx, bool);
/* Output stack usage information. */
extern void output_stack_usage (void);
/* dbxout helper functions */
#if defined DBX_DEBUGGING_INFO || defined XCOFF_DEBUGGING_INFO


@@ -1,3 +1,11 @@
2010-08-30 Eric Botcazou <ebotcazou@adacore.com>
* lib/gcc-dg.exp (cleanup-stack-usage): New procedure.
* lib/scanasm.exp (scan-stack-usage): Likewise.
(scan-stack-usage-not): Likewise.
* gcc.dg/stack-usage-1.c: New test.
* gcc.target/i386/stack-usage-realign.c: Likewise.
2010-08-30 Zdenek Dvorak <ook@ucw.cz>
PR tree-optimization/45427


@@ -0,0 +1,43 @@
/* { dg-do compile } */
/* { dg-options "-fstack-usage" } */
/* This is aimed at testing basic support for -fstack-usage in the back-ends.
See the SPARC back-end for an example (grep flag_stack_usage in sparc.c).
Once it is implemented, adjust SIZE below so that the stack usage for the
function FOO is reported as 256 or 264 in the stack usage (.su) file.
Then check that this is the actual stack usage in the assembly file. */
#if defined(__i386__)
# define SIZE 248
#elif defined(__x86_64__)
# define SIZE 356
#elif defined (__sparc__)
# if defined (__arch64__)
# define SIZE 76
# else
# define SIZE 160
# endif
#elif defined(__hppa__)
# define SIZE 192
#elif defined (__alpha__)
# define SIZE 240
#elif defined (__ia64__)
# define SIZE 272
#elif defined(__mips__)
# define SIZE 240
#elif defined (__powerpc__) || defined (__PPC__) || defined (__ppc__) \
|| defined (__POWERPC__) || defined (PPC) || defined (_IBMR2)
# define SIZE 240
#else
# define SIZE 256
#endif
int foo (void)
{
char arr[SIZE];
arr[0] = 1;
return 0;
}
/* { dg-final { scan-stack-usage "foo\t\(256|264\)\tstatic" } } */
/* { dg-final { cleanup-stack-usage } } */


@@ -0,0 +1,20 @@
/* { dg-do compile } */
/* { dg-require-effective-target ilp32 } */
/* { dg-options "-fstack-usage -msse2 -mforce-drap" } */
typedef int __attribute__((vector_size(16))) vec;
vec foo (vec v)
{
return v;
}
int main (void)
{
vec V;
V = foo (V);
return 0;
}
/* { dg-final { scan-stack-usage "main\t48\tdynamic,bounded" } } */
/* { dg-final { cleanup-stack-usage } } */


@@ -1,5 +1,5 @@
# Copyright (C) 1997, 1999, 2000, 2003, 2004, 2005, 2006, 2007, 2008, 2009
# Free Software Foundation, Inc.
# Copyright (C) 1997, 1999, 2000, 2003, 2004, 2005, 2006, 2007, 2008, 2009,
# 2010 Free Software Foundation, Inc.
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
@@ -460,6 +460,11 @@ proc cleanup-ipa-dump { suffix } {
cleanup-dump "\[0-9\]\[0-9\]\[0-9\]i.$suffix"
}
# Remove a stack usage file for the current test.
proc cleanup-stack-usage { args } {
    cleanup-dump "su"
}
# Remove all dump files with the provided suffix.
proc cleanup-dump { suffix } {
# This assumes that we are three frames down from dg-test or some other

Index: gcc/testsuite/lib/scanasm.exp

@@ -1,4 +1,5 @@
-# Copyright (C) 2000, 2002, 2003, 2007, 2008 Free Software Foundation, Inc.
+# Copyright (C) 2000, 2002, 2003, 2007, 2008, 2010
+# Free Software Foundation, Inc.
 # This program is free software; you can redistribute it and/or modify
 # it under the terms of the GNU General Public License as published by
@@ -154,6 +155,28 @@ proc scan-file-not { output_file args } {
dg-scan "scan-file-not" 0 $testcase $output_file $args
}
# Look for a pattern in the .su file produced by the compiler.  See
# dg-scan for details.
proc scan-stack-usage { args } {
    upvar 2 name testcase
    set testcase [lindex $testcase 0]
    set output_file "[file rootname [file tail $testcase]].su"

    dg-scan "scan-file" 1 $testcase $output_file $args
}

# Check that a pattern is not present in the .su file produced by the
# compiler.  See dg-scan for details.
proc scan-stack-usage-not { args } {
    upvar 2 name testcase
    set testcase [lindex $testcase 0]
    set output_file "[file rootname [file tail $testcase]].su"

    dg-scan "scan-file-not" 0 $testcase $output_file $args
}
# Call pass if pattern is present given number of times, otherwise fail.
proc scan-assembler-times { args } {
if { [llength $args] < 2 } {

Index: gcc/toplev.c

@@ -350,6 +350,7 @@ static const param_info lang_independent_params[] = {
FILE *asm_out_file;
FILE *aux_info_file;
FILE *stack_usage_file = NULL;
FILE *dump_file = NULL;
const char *dump_file_name;
@@ -1584,6 +1585,88 @@ alloc_for_identifier_to_locale (size_t len)
return ggc_alloc_atomic (len);
}
/* Output stack usage information.  */

void
output_stack_usage (void)
{
  static bool warning_issued = false;
  enum stack_usage_kind_type { STATIC = 0, DYNAMIC, DYNAMIC_BOUNDED };
  const char *stack_usage_kind_str[] = {
    "static",
    "dynamic",
    "dynamic,bounded"
  };
  HOST_WIDE_INT stack_usage = current_function_static_stack_size;
  enum stack_usage_kind_type stack_usage_kind;
  expanded_location loc;
  const char *raw_id, *id;

  if (stack_usage < 0)
    {
      if (!warning_issued)
	{
	  warning (0, "-fstack-usage not supported for this target");
	  warning_issued = true;
	}
      return;
    }

  stack_usage_kind = STATIC;

  /* Add the maximum amount of space pushed onto the stack.  */
  if (current_function_pushed_stack_size > 0)
    {
      stack_usage += current_function_pushed_stack_size;
      stack_usage_kind = DYNAMIC_BOUNDED;
    }

  /* Now on to the tricky part: dynamic stack allocation.  */
  if (current_function_allocates_dynamic_stack_space)
    {
      if (current_function_has_unbounded_dynamic_stack_size)
	stack_usage_kind = DYNAMIC;
      else
	stack_usage_kind = DYNAMIC_BOUNDED;

      /* Add the size even in the unbounded case, this can't hurt.  */
      stack_usage += current_function_dynamic_stack_size;
    }

  loc = expand_location (DECL_SOURCE_LOCATION (current_function_decl));

  /* Strip the scope prefix if any.  */
  raw_id = lang_hooks.decl_printable_name (current_function_decl, 2);
  id = strrchr (raw_id, '.');
  if (id)
    id++;
  else
    id = raw_id;

  fprintf (stack_usage_file,
	   "%s:%d:%d:%s\t" HOST_WIDE_INT_PRINT_DEC "\t%s\n",
	   basename (loc.file),
	   loc.line,
	   loc.column,
	   id,
	   stack_usage,
	   stack_usage_kind_str[stack_usage_kind]);
}
/* Open an auxiliary output file.  */

static FILE *
open_auxiliary_file (const char *ext)
{
  char *filename;
  FILE *file;

  filename = concat (aux_base_name, ".", ext, NULL);
  file = fopen (filename, "w");
  if (!file)
    fatal_error ("can't open %s for writing: %m", filename);
  free (filename);
  return file;
}
/* Initialization of the front end environment, before command line
options are parsed. Signal handlers, internationalization etc.
ARGV0 is main's argv[0]. */
@@ -2199,6 +2282,10 @@ lang_dependent_init (const char *name)
  init_asm_output (name);

  /* If stack usage information is desired, open the output file.  */
  if (flag_stack_usage)
    stack_usage_file = open_auxiliary_file ("su");
/* This creates various _DECL nodes, so needs to be called after the
front end is initialized. */
init_eh ();
@@ -2280,6 +2367,9 @@ finalize (void)
      unlink_if_ordinary (asm_file_name);
    }

  if (stack_usage_file)
    fclose (stack_usage_file);

  statistics_fini ();
finish_optimization_passes ();

Index: gcc/tree.h

@@ -522,7 +522,8 @@ struct GTY(()) tree_common {
        BLOCK
     all decls
-       CALL_FROM_THUNK_P in
+       CALL_FROM_THUNK_P and
+       ALLOCA_FOR_VAR_P in
        CALL_EXPR
   side_effects_flag:
@@ -1329,6 +1330,10 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
thunked-to function. */
#define CALL_FROM_THUNK_P(NODE) (CALL_EXPR_CHECK (NODE)->base.protected_flag)
/* In a CALL_EXPR, if the function being called is BUILT_IN_ALLOCA, means that
it has been built for the declaration of a variable-sized object. */
#define ALLOCA_FOR_VAR_P(NODE) (CALL_EXPR_CHECK (NODE)->base.protected_flag)
/* In a type, nonzero means that all objects of the type are guaranteed by the
language or front-end to be properly aligned, so we can indicate that a MEM
of this type is aligned at least to the alignment of the type, even if it