Make the munge_options function round the largest_required_pool_block
value to a multiple of the smallest pool size (currently 8 bytes) to
avoid pools with odd sizes.
Ensure there is a pool large enough for blocks of the requested size.
Previously when largest_required_pool_block was exactly equal to one of
the pool_sizes[] values there would be no pool of that size. This patch
increases _M_npools by one, so there is a pool at least as large as the
requested value. It also reduces the size of the largest pool to be no
larger than needed.
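As a rough illustration, the rounding amounts to something like this
(the pool size constants here are assumptions, not the library's
actual values):

  #include <cstddef>

  constexpr std::size_t smallest_pool = 8;    // smallest pool block size
  constexpr std::size_t largest_pool = 4096;  // stand-in for pool_sizes[] max

  std::size_t munge_block_size (std::size_t largest_required_pool_block)
  {
    // Round up to a multiple of the smallest pool size ...
    std::size_t b = (largest_required_pool_block + smallest_pool - 1)
		    & ~(smallest_pool - 1);
    // ... and round excessively large values down to the largest pool size.
    return b < largest_pool ? b : largest_pool;
  }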
* src/c++17/memory_resource.cc (munge_options): Round up value of
largest_required_pool_block to multiple of smallest pool size. Round
excessively large values down to largest pool size.
(select_num_pools): Increase number of pools by one unless it exactly
matches requested largest_required_pool_block.
(__pool_resource::_M_alloc_pools()): Make largest pool size equal
largest_required_pool_block.
* testsuite/20_util/unsynchronized_pool_resource/options.cc: Check
that pool_options::largest_required_pool_block is set appropriately.
From-SVN: r266089
Since a big_block rounds up the size to a multiple of big_block::min, it
is wrong to assert that the supplied number of bytes equals the
big_block's size(). Add big_block::alloc_size(size_t) to calculate the
allocated size consistently, and add comments to the code.
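A minimal sketch of the helper (the value of big_block::min here is an
assumption; only the rounding idiom matters):

  #include <cstddef>

  struct big_block
  {
    static constexpr std::size_t min = 16;  // assumed granularity

    // Round bytes up to a multiple of min. Both the constructor and
    // deallocate use this, so they agree on the allocated size.
    static std::size_t alloc_size (std::size_t bytes)
    { return (bytes + min - 1) & ~(min - 1); }
  };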
* src/c++17/memory_resource.cc (big_block): Improve comments.
(big_block::all_ones): Remove.
(big_block::big_block(size_t, size_t)): Use alloc_size.
(big_block::size()): Add comment, replace all_ones with equivalent
expression.
(big_block::align()): Shift value of correct type.
(big_block::alloc_size(size_t)): New function to round up size.
(__pool_resource::allocate(size_t, size_t)): Add comment.
(__pool_resource::deallocate(void*, size_t, size_t)): Likewise. Fix
incorrect assertion by using big_block::alloc_size(size_t).
* testsuite/20_util/unsynchronized_pool_resource/allocate.cc: Add
more tests for unpooled allocations.
From-SVN: r266088
* src/c++17/memory_resource.cc (bitset::full()): Handle edge case
for _M_next_word maximum value.
(bitset::get_first_unset(), bitset::set(size_type)): Use
update_next_word() to update _M_next_word.
(bitset::update_next_word()): New function, avoiding wraparound of the
unsigned _M_next_word member (see the sketch after this entry).
(bitset::max_word_index()): New function.
(chunk::chunk(void*, uint32_t, void*, size_t)): Add assertion.
(chunk::max_bytes_per_chunk()): New function.
(pool::replenish(memory_resource*, const pool_options&)): Prevent
_M_blocks_per_chunk from exceeding max_blocks_per_chunk or from
causing chunk::max_bytes_per_chunk() to be exceeded.
* testsuite/20_util/unsynchronized_pool_resource/allocate-max-chunks.cc:
New test.
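The wraparound avoidance in update_next_word amounts to clamping the
increment; a minimal sketch (the names follow the ChangeLog, the types
are assumptions):

  void update_next_word () noexcept
  {
    // Never advance past the last word index, so the unsigned
    // _M_next_word member cannot wrap back to zero.
    size_t next = _M_next_word + 1;
    _M_next_word = next < max_word_index () ? next : max_word_index ();
  }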
From-SVN: r266087
gcc/
PR rtl-optimization/87899
* lra-lives.c (start_living): Update white space in comment.
(enum point_type): New.
(sparseset_contains_pseudos_p): New function.
(update_pseudo_point): Likewise.
(make_hard_regno_live): Use HARD_REGISTER_NUM_P macro.
(make_hard_regno_dead): Likewise. Remove ignore_reg_for_conflicts
handling. Move early exit after adding conflicts.
(mark_pseudo_live): Use HARD_REGISTER_NUM_P macro. Add early exit
if regno is already live. Remove all handling of program points.
(mark_pseudo_dead): Use HARD_REGISTER_NUM_P macro. Add early exit
after adding conflicts. Remove all handling of program points and
ignore_reg_for_conflicts.
(mark_regno_live): Use HARD_REGISTER_NUM_P macro. Remove return value
and do not guard call to mark_pseudo_live.
(mark_regno_dead): Use HARD_REGISTER_NUM_P macro. Remove return value
and do not guard call to mark_pseudo_dead.
(check_pseudos_live_through_calls): Use HARD_REGISTER_NUM_P macro.
(process_bb_lives): Use HARD_REGISTER_NUM_P and HARD_REGISTER_P macros.
Use new function update_pseudo_point. Handle register copies by
removing the source register from the live set. Handle INOUT operands.
Update to the next program point using the unused_set, dead_set and
start_dying sets.
(lra_create_live_ranges_1): Use HARD_REGISTER_NUM_P macro.
From-SVN: r266086
2018-11-13 Richard Biener <rguenther@suse.de>
PR tree-optimization/86991
* tree-vect-loop.c (vect_is_slp_reduction): Delay reduction
group building until we have successfully detected the SLP
reduction.
(vect_is_simple_reduction): Remove fixup code here.
* gcc.dg/pr86991.c: New testcase.
From-SVN: r266081
If called when !dump_enabled_p, the dump_* functions effectively do
nothing, but as of r263178 doing "nothing" involves non-trivial work
internally.
I wasn't sure whether the dump_* functions should assert that
dump_enabled_p () is true when they're called, or should bail out
immediately in that case, so this patch implements both: we get an
assertion failure when assertions are enabled, and otherwise bail out
when !dump_enabled_p.
The patch also fixes all of the places I found during testing
(on x86_64-pc-linux-gnu) that call into dump_* but weren't guarded by
if (dump_enabled_p ())
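The resulting macro behaves roughly like this sketch (the actual
definition in dumpfile.c may differ in details):

  #define VERIFY_DUMP_ENABLED_P \
    do \
      { \
	/* Assert when checking is enabled...  */ \
	gcc_checking_assert (dump_enabled_p ()); \
	/* ...and bail out silently when it is not.  */ \
	if (!dump_enabled_p ()) \
	  return; \
      } \
    while (0)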
gcc/ChangeLog:
* dumpfile.c (VERIFY_DUMP_ENABLED_P): New macro.
(dump_gimple_stmt): Use it.
(dump_gimple_stmt_loc): Likewise.
(dump_gimple_expr): Likewise.
(dump_gimple_expr_loc): Likewise.
(dump_generic_expr): Likewise.
(dump_generic_expr_loc): Likewise.
(dump_printf): Likewise.
(dump_printf_loc): Likewise.
(dump_dec): Likewise.
(dump_dec): Likewise.
(dump_hex): Likewise.
(dump_symtab_node): Likewise.
gcc/ChangeLog:
* gimple-loop-interchange.cc (tree_loop_interchange::interchange):
Guard dump call with dump_enabled_p.
* graphite-isl-ast-to-gimple.c (graphite_regenerate_ast_isl): Likewise.
* graphite-optimize-isl.c (optimize_isl): Likewise.
* graphite.c (graphite_transform_loops): Likewise.
* tree-loop-distribution.c (pass_loop_distribution::execute): Likewise.
* tree-parloops.c (parallelize_loops): Likewise.
* tree-ssa-loop-niter.c (number_of_iterations_exit): Likewise.
* tree-vect-data-refs.c (vect_analyze_group_access_1): Likewise.
(vect_prune_runtime_alias_test_list): Likewise.
* tree-vect-loop.c (vect_update_vf_for_slp): Likewise.
(vect_estimate_min_profitable_iters): Likewise.
* tree-vect-slp.c (vect_record_max_nunits): Likewise.
(vect_build_slp_tree_2): Likewise.
(vect_supported_load_permutation_p): Likewise.
(vect_slp_analyze_operations): Likewise.
(vect_slp_analyze_bb_1): Likewise.
(vect_slp_bb): Likewise.
* tree-vect-stmts.c (vect_analyze_stmt): Likewise.
* tree-vectorizer.c (try_vectorize_loop_1): Likewise.
(pass_slp_vectorize::execute): Likewise.
(increase_alignment): Likewise.
From-SVN: r266080
PR ipa/87955 reports a problem I introduced in r265920, where I converted
the guard in report_inline_failed_reason from using:
if (dump_file)
to using:
if (dump_enabled_p ())
without updating the calls to cl_target_option_print_diff and
cl_optimization_print_diff, which assume that dump_file is non-NULL.
The functions are auto-generated. Rather than porting them to the dump
API, this patch applies the workaround of adding the missing checks on
dump_file before calling them.
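The fix amounts to the following pattern (the option pointers here are
hypothetical, and the generated printers' signatures are as assumed):

  if (dump_enabled_p ())
    {
      /* The generated printers need a non-NULL FILE *, so keep a
	 dump_file guard even inside the dump_enabled_p () region.  */
      if (dump_file)
	{
	  cl_target_option_print_diff (dump_file, 2, caller_tgt, callee_tgt);
	  cl_optimization_print_diff (dump_file, 2, caller_opt, callee_opt);
	}
    }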
gcc/ChangeLog:
PR ipa/87955
* ipa-inline.c (report_inline_failed_reason): Guard calls to
cl_target_option_print_diff and cl_optimization_print_diff with
if (dump_file).
gcc/testsuite/ChangeLog:
PR ipa/87955
* gcc.target/i386/pr87955.c: New test.
From-SVN: r266079
gcc/ChangeLog:
* doc/invoke.texi (-fsave-optimization-record): Note that the
output is compressed.
* optinfo-emit-json.cc: Include <zlib.h>.
(optrecord_json_writer::write): Compress the output.
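A minimal sketch of the compression step using zlib's gzFile API (the
path handling and error reporting of the real writer are not shown):

  #include <zlib.h>
  #include <string>

  void write_compressed (const char *path, const std::string &json)
  {
    gzFile f = gzopen (path, "w");
    if (!f)
      return;
    gzwrite (f, json.data (), (unsigned) json.size ());
    gzclose (f);
  }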
From-SVN: r266078
* tree-vrp.c (value_range_base::dump): Dump type.
Do not use INF nomenclature for 1-bit types.
(dump_value_range): Group all variants to common dumping code.
(debug): New overloaded functions for value_ranges.
(value_range_base::dump): Remove no argument version.
(value_range::dump): Same.
testsuite/
* gcc.dg/tree-ssa/pr64130.c: Adjust for new value_range pretty
printer.
* gcc.dg/tree-ssa/vrp92.c: Same.
From-SVN: r266077
2018-11-13 Richard Biener <rguenther@suse.de>
PR tree-optimization/87931
* tree-vect-loop.c (vect_is_simple_reduction): Restrict
nested cycles we support to latch computations vectorizable_reduction
handles.
* gcc.dg/graphite/pr87931.c: New testcase.
From-SVN: r266075
2018-11-13 Martin Liska <mliska@suse.cz>
PR tree-optimization/87885
* cfghooks.c (account_profile_record): Rename
to ...
(profile_record_check_consistency): ... this.
Calculate missing num_mismatched_freq_in.
(profile_record_account_profile): New function
that calculates time and size of a function.
* cfghooks.h (struct profile_record): Remove
all tuples.
(struct cfg_hooks): Remove after_pass flag.
(account_profile_record): Rename to ...
(profile_record_check_consistency): ... this.
(profile_record_account_profile): New.
* cfgrtl.c (rtl_account_profile_record): Remove
after_pass flag.
* passes.c (check_profile_consistency): Do only
checking.
(account_profile): Calculate size and time of
function only.
(pass_manager::dump_profile_report): Reformat
output.
(execute_one_ipa_transform_pass): Call
the consistency check before clean-up and call account_profile
after clean-up is done.
(execute_one_pass): Call check_profile_consistency and
account_profile instead of using the after_pass flag.
* tree-cfg.c (gimple_account_profile_record): Likewise.
From-SVN: r266074
This patch enables targets to describe DR_TARGET_ALIGNMENT as a value that
is not necessarily fixed at compile time. It does so by turning the
variable into a 'poly_uint64'.
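In practice, checks must now be explicit about whether the alignment is
a compile-time constant; an illustrative fragment (the poly-int API
names are real, the surrounding code is made up):

  poly_uint64 align = dr_info->target_alignment;
  unsigned HOST_WIDE_INT const_align;
  if (!align.is_constant (&const_align))
    return false;  /* e.g. vect_do_peeling now exits early here.  */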
gcc/ChangeLog:
2018-11-13 Andre Vieira <andre.simoesdiasvieira@arm.com>
* config/aarch64/aarch64.c
(aarch64_vectorize_preferred_vector_alignment): Change return type to
poly_uint64.
(aarch64_simd_vector_alignment_reachable): Adapt to preferred vector
alignment being a poly int.
* doc/tm.texi (TARGET_VECTORIZE_PREFERRED_VECTOR_ALIGNMENT): Change
return type to poly_uint64.
* target.def (default_preferred_vector_alignment): Likewise.
* targhooks.c (default_preferred_vector_alignment): Likewise.
* targhooks.h (default_preferred_vector_alignment): Likewise.
* tree-vect-data-refs.c (vect_calculate_target_alignment): Likewise.
(vect_compute_data_ref_alignment): Adapt to vector alignment being a
poly int.
(vect_update_misalignment_for_peel): Likewise.
(vect_enhance_data_refs_alignment): Likewise.
(vect_find_same_alignment_drs): Likewise.
(vect_duplicate_ssa_name_ptr_info): Likewise.
(vect_setup_realignment): Likewise.
(vect_can_force_dr_alignment_p): Change alignment parameter type to
poly_uint64.
* tree-vect-loop-manip.c (get_misalign_in_elems): Learn to construct a
mask with a compile-time variable vector alignment.
(vect_gen_prolog_loop_niters): Adapt to vector alignment being a poly
int.
(vect_do_peeling): Exit early if vector alignment is not constant.
* tree-vect-stmts.c (ensure_base_align): Adapt to vector alignment being
a poly int.
(vectorizable_store): Likewise.
(vectorizable_load): Likewise.
* tree-vectorizer.h (struct dr_vec_info): Make target_alignment field a
poly_uint64.
(vect_known_alignment_in_bytes): Adapt to vector alignment being a
poly int.
(vect_can_force_dr_alignment_p): Change alignment parameter type to
poly_uint64.
From-SVN: r266072
2018-11-13 Richard Biener <rguenther@suse.de>
PR tree-optimization/87967
* tree-vect-loop.c (vect_transform_loop): Also copy PHIs
for constants for the scalar loop.
* g++.dg/opt/pr87967.C: New testcase.
From-SVN: r266070
For -mcmodel=medium we can use toc-relative addressing to access
constants placed in read-only data, which is better since they can be
merged when in .rodata.cst8.
* config/rs6000/linux64.h (ASM_OUTPUT_SPECIAL_POOL_ENTRY_P): Exclude
integer constants when -mcmodel=medium.
From-SVN: r266069
Avoid emitting the lp instruction when we find jump table data placed in
the text section within its ZOL body. One reason is that the jump table
size can change later on, so the total ZOL length may be wrongly
computed.
gcc/
xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/arc.c (hwloop_optimize): Bailout when detecting a
jump table data in the text section.
From-SVN: r266067
Our ABI says blink is pushed first on the stack, followed by an unknown
number of register saves, and finally by fp. Hence we cannot use the
EH_RETURN_ADDRESS macro, as the stack is not finalized at that moment.
The alternative is to use the eh_return pattern and to initialize all
the bits after register allocation when the stack layout is finalized.
gcc/
xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/arc.c (arc_eh_return_address_location): Repurpose it
to fit the eh_return pattern.
* config/arc/arc.md (eh_return): Define.
(VUNSPEC_ARC_EH_RETURN): Likewise.
* config/arc/arc-protos.h (arc_eh_return_address_location): Match
new implementation.
* config/arc/arc.h (EH_RETURN_HANDLER_RTX): Remove it.
testsuite/
xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com>
* gcc.target/arc/builtin_eh.c: New test.
From-SVN: r266066
Reimplement how the prologue and epilogue are emitted to accommodate
enter/leave instructions, as well as to improve the code size of the
existing techniques.
The following modifications are added:
- millicode thunk calls can now be selected regardless of the
optimization level. However, they are enabled for size optimizations
by default. Also, the millicode optimization is turned off when we
compile for long jumps.
- the compiler is able to use enter/leave instructions for the prologue
and epilogue. As these instructions are not ABI compatible, we guard
them under a switch (i.e., -mcode-density-frame). When this option
is on, the compiler will try emitting enter/leave instructions; if
not, then millicode thunk calls (if enabled), and finally the regular
push/pop instructions.
- the prologue/epilogue is now optimized to use pointer walks, hence
improving the chance of having push_s/pop_s instructions emitted. It
also tries to combine the stack adjustments with load/store
operations.
gcc/
xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com>
* common/config/arc/arc-common.c (arc_option_optimization_table):
Millicode optimization is default on for size optimizations.
* config/arc/arc-protos.h (arc_check_multi): New function.
* config/arc/arc.c (RTX_OK_FOR_OFFSET_P): Rearange.
(ENTER_LEAVE_START_REG): Define.
(ENTER_LEAVE_END_REG): Likewise.
(arc_override_options): Disable millicode when long calls option
is on.
(arc_frame_info): Change it from int to bool.
(arc_compute_frame_size): Clean up.
(arc_save_restore): Remove.
(frame_save_reg): New function.
(frame_restore_reg): Likewise.
(arc_enter_leave_p): Likewise.
(arc_save_callee_saves): Likewise.
(arc_restore_callee_saves): Likewise.
(arc_save_callee_enter): Likewise.
(arc_restore_callee_leave): Likewise.
(arc_save_callee_milli): Likewise.
(arc_restore_callee_milli): Likewise.
(arc_expand_prologue): Reimplement to emit enter/leave
instructions.
(arc_expand_epilogue): Likewise.
(arc_check_multi): New function.
* config/arc/arc.md (push_multi_fp): New pattern.
(push_multi_fp_blink): Likewise.
(pop_multi_fp): Likewise.
(pop_multi_fp_blink): Likewise.
(pop_multi_fp_ret): Likewise.
(pop_multi_fp_blink_ret): Likewise.
* config/arc/arc.opt (mmillicode): Update option.
(mcode-density-frame): New option.
* config/arc/predicates.md (push_multi_operand): New predicate.
(pop_multi_operand): Likewise.
* doc/invoke.texi (ARC): Update ARC options information.
gcc/testsuite
xxxx-xx-xx Claudiu Zissulescu <claziss@synopsys.com>
* gcc.target/arc/firq-1.c: Update test.
* gcc.target/arc/firq-3.c: Likewise.
* gcc.target/arc/firq-4.c: Likewise.
* gcc.target/arc/interrupt-6.c: Likewise.
From-SVN: r266065
Add simple peephole rules that combine multiple ld/st instructions into
64-bit load/store instructions. They only apply to architectures that
have the double load/store option enabled.
gcc/
Claudiu Zissulescu <claziss@synopsys.com>
* config/arc/arc-protos.h (gen_operands_ldd_std): Add.
* config/arc/arc.c (operands_ok_ldd_std): New function.
(mem_ok_for_ldd_std): Likewise.
(gen_operands_ldd_std): Likewise.
* config/arc/arc.md: Add peephole2 rules for std/ldd.
From-SVN: r266064
* toplev.c (output_stack_usage): Turn test on flag_stack_usage into
test on stack_usage_file.
(lang_dependent_init): Do not open the .su file if generating LTO.
From-SVN: r266063
PR rtl-optimization/87918
* simplify-rtx.c (simplify_merge_mask): For COMPARISON_P, use
simplify_gen_relational rather than simplify_gen_binary.
* gcc.target/i386/pr87918.c: New test.
From-SVN: r266062
2018-11-13 Xianmiao Qu <xianmiao_qu@c-sky.com>
libgcc/
* config/csky/linux-unwind.h (_sig_ucontext_t): Remove.
(csky_fallback_frame_state): Modify the check of the
instructions to adapt to changes in the kernel.
From-SVN: r266060
* gcc-interface/misc.c (gnat_init_gcc_eh): Set -fnon-call-exceptions
for the runtime on platforms where System.Machine_Overflows is true.
From-SVN: r266057
When lambdas were added in C++11 they were banned from unevaluated contexts
as a way to avoid needing to deal with them in mangling or SFINAE. This
change avoids those problems with a narrower rule: lambdas never compare as
equivalent (so we don't need to mangle them), and substitution failures
within a lambda are hard errors. Lambdas appearing in places where types
couldn't previously have been declared introduce various complications; in
particular, it seems likely to mean types with no linkage being used more
broadly, risking ODR violations. I want to follow up this patch with some
related diagnostics.
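For example, the following is now accepted (C++20):

  #include <type_traits>

  // A lambda in an unevaluated operand.
  using T = decltype ([] { return 42; });
  static_assert (std::is_class_v<T>);

  // Two lambda-expressions never have the same type, so lambdas never
  // compare as equivalent.
  static_assert (!std::is_same_v<decltype ([] {}), decltype ([] {})>);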
* decl2.c (min_vis_expr_r): Handle LAMBDA_EXPR.
* mangle.c (write_expression): Handle LAMBDA_EXPR.
* parser.c (cp_parser_lambda_expression): Allow lambdas in
unevaluated context. Start the tentative firewall sooner.
(cp_parser_lambda_body): Use cp_evaluated.
* pt.c (iterative_hash_template_arg): Handle LAMBDA_EXPR.
(tsubst_function_decl): Substitute a lambda even if it isn't
dependent.
(tsubst_lambda_expr): Use cp_evaluated. Always complain.
(tsubst_copy_and_build) [LAMBDA_EXPR]: Do nothing if tf_partial.
* semantics.c (begin_class_definition): Allow in template parm list.
* tree.c (strip_typedefs_expr): Pass through LAMBDA_EXPR.
(cp_tree_equal): Handle LAMBDA_EXPR.
From-SVN: r266056
Previously, when we got a function template with explicit arguments for all
of the template parameters, we still did "deduction", which of course
couldn't deduce anything, but did other deduction-time checking of
non-dependent conversions and such. This broke down with the unevaluated
lambdas patch (to follow): substituting into the lambda multiple times, once
to get the function type for deduction and then again to generate the actual
decl, doesn't work, since different substitutions of a lambda produce
different types. I believe that skipping the initial substitution when we
have all the arguments is still conformant, and produces better diagnostics
for some testcases.
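For example, with the (hypothetical) code

  template <typename T> T f (T x) { return x; }
  auto v = f<int> (3.5);

T is fixed to int by the explicit argument, so we can substitute
directly and convert 3.5 to int, with nothing left to deduce.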
* pt.c (fn_type_unification): If we have a full set of explicit
arguments, go straight to substitution.
From-SVN: r266055
We weren't properly constraining visibility based on names that appear in
the mangled representation of expressions. This was made more obvious
by the upcoming unevaluated lambdas patch.
* decl2.c (min_vis_expr_r, expr_visibility): New.
(min_vis_r): Call expr_visibility.
(constrain_visibility_for_template): Likewise.
From-SVN: r266054
A destroying operator delete takes responsibility for calling the destructor
for the object it is deleting; this is intended to be useful for sized
delete of a class allocated with a trailing buffer, where the compiler can't
know the size of the allocation, and so would pass the wrong size to the
non-destroying sized operator delete.
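A minimal example of the feature (the malloc-based allocation here is
just for illustration):

  #include <new>
  #include <cstdlib>

  struct S
  {
    void *operator new (std::size_t n) { return std::malloc (n); }

    // Destroying delete: the operator runs the destructor itself, so a
    // delete-expression no longer emits a separate destructor call.
    void operator delete (S *p, std::destroying_delete_t)
    {
      p->~S ();
      std::free (p);
    }
  };

A "delete p;" where p has type S* then calls this operator directly.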
gcc/c-family/
* c-cppbuiltin.c (c_cpp_builtins): Define
__cpp_impl_destroying_delete.
gcc/cp/
* call.c (std_destroying_delete_t_p, destroying_delete_p): New.
(aligned_deallocation_fn_p, usual_deallocation_fn_p): Use
destroying_delete_p.
(build_op_delete_call): Handle destroying delete.
* decl2.c (coerce_delete_type): Handle destroying delete.
* init.c (build_delete): Don't call dtor with destroying delete.
* optimize.c (build_delete_destructor_body): Likewise.
libstdc++-v3/
* libsupc++/new (std::destroying_delete_t): New.
From-SVN: r266053
Mostly this was straightforward; the tricky bit was finding, in the
instantiation, the set of capture proxies built when instantiating the
init-capture. The comment in lookup_init_capture_pack goes into detail.
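For example, a pack init-capture now works (C++20):

  #include <utility>

  template <typename... Ts>
  auto capture_all (Ts... args)
  {
    // xs is a pack of init-captures, one per element of args.
    return [...xs = std::move (args)] { return (xs + ... + 0); };
  }

Here capture_all (1, 2, 3) () evaluates to 6.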
* parser.c (cp_parser_lambda_introducer): Parse pack init-capture.
* pt.c (tsubst_pack_expansion): Handle init-capture packs.
(lookup_init_capture_pack): New.
(tsubst_expr) [DECL_EXPR]: Use it.
(tsubst_lambda_expr): Remember field pack expansions for
init-captures.
From-SVN: r266052
This patch simplifies the saving/clearing/restoring of
cp_unevaluated_operand and c_inhibit_evaluation_warnings in the presence
of mid-block returns; a sketch of the helper follows the ChangeLog
entries below.
* cp-tree.h (struct cp_evaluated): New.
* init.c (get_nsdmi): Use it.
* parser.c (cp_parser_enclosed_template_argument_list): Use it.
* pt.c (coerce_template_parms, tsubst_aggr_type): Use it.
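The struct is an RAII helper along these lines (the member names here
are assumptions; see cp-tree.h for the real definition):

  struct cp_evaluated
  {
    int saved_operand = cp_unevaluated_operand;
    int saved_warnings = c_inhibit_evaluation_warnings;

    // Clear both flags on entry...
    cp_evaluated ()
    { cp_unevaluated_operand = 0; c_inhibit_evaluation_warnings = 0; }

    // ...and restore them on every exit path, including mid-block returns.
    ~cp_evaluated ()
    {
      cp_unevaluated_operand = saved_operand;
      c_inhibit_evaluation_warnings = saved_warnings;
    }
  };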
From-SVN: r266051
People objected to the old macro name as unclear, so it was changed.
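The macro advertises support for conditionally explicit constructors,
e.g.:

  #include <type_traits>

  template <typename T>
  struct wrapper
  {
    // explicit(bool): explicit exactly when T doesn't convert to int.
    explicit (!std::is_convertible_v<T, int>) wrapper (T);
  };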
* c-cppbuiltin.c (c_cpp_builtins): Change __cpp_explicit_bool to
__cpp_conditional_explicit.
From-SVN: r266050
This patch removes a call that is only necessary when using reload. It
also corrects a PRE_DEC address offset.
* config/rs6000/rs6000.c (rs6000_secondary_reload_inner): Negate
offset for PRE_DEC.
(rs6000_secondary_reload_gpr): Don't call find_replacement.
From-SVN: r266049
This catches a few places where move insn patterns don't slightly
disparage CTR, LR and VRSAVE regs. Also fixes the doc for the rs6000
h constraint, and removes an r->cl alternative covered by r->h.
* doc/md.texi (Machine Constraints): Correct rs6000 h constraint
description.
* config/rs6000/rs6000.md (movsi_internal1): Delete MT%0 case
covered by alternative.
(movcc_internal1): Ignore h for register preference.
(mov<mode>_hardfloat64): Likewise.
(mov<mode>_softfloat): Ignore c, l, h for register preference.
From-SVN: r266044
The Linux kernel requires and emulates LL and SC for the R5900 too. The
special --without-llsc default for the R5900 is therefore not applicable
in that case.
Reviewed-by: Maciej W. Rozycki <macro@linux-mips.org>
2018-11-12 Fredrik Noring <noring@nocrew.org>
gcc/
* config.gcc: Update with-llsc defaults for MIPS r5900.
From-SVN: r266038
2018-11-12 Martin Liska <mliska@suse.cz>
PR target/87903
* doc/extend.texi: Add missing values for __builtin_cpu_is and
__builtin_cpu_supports for x86 target.
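For reference, the documented built-ins are used like so (GCC on x86):

  #include <cstdio>

  int main ()
  {
    __builtin_cpu_init ();
    if (__builtin_cpu_is ("intel"))
      std::puts ("Intel CPU");
    if (__builtin_cpu_supports ("avx2"))
      std::puts ("AVX2 available");
  }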
From-SVN: r266036