Commit Graph

636 Commits

Author SHA1 Message Date
Tobias Burnus 9967206d41 [og13] OpenMP: Call cuMemcpy2D/cuMemcpy3D for nvptx for omp_target_memcpy_rect
This is a version of Tobias's mainline patch of the same name,
merged to og13 and with the followup patch "libgomp: cuda.h and
omp_target_memcpy_rect cleanup" folded in.  A couple of merge conflicts
have also been resolved, mostly regarding "gomp_update".  Tobias's
original log message follows.

When copying a 2D or 3D rectangular memmory block, the performance is
better when using CUDA's cuMemcpy2D/cuMemcpy3D instead of copying the
data one by one. That's what this commit does.

Additionally, it permits device-to-device copies, if necessary using a
temporary variable on the host.

2023-09-19  Tobias Burnus  <tobias@codesourcery.com>
	    Julian Brown  <julian@codesourcery.com>

include/
	* cuda/cuda.h (CUlimit): Add CUDA_ERROR_NOT_INITIALIZED,
	CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_INVALID_HANDLE.
	(CUarray, CUmemorytype, CUDA_MEMCPY2D, CUDA_MEMCPY3D,
	CUDA_MEMCPY3D_PEER): New typdefs.
	(cuMemcpyPeer, cuMemcpyPeerAsync, cuMemcpy2D, cuMemcpy2DAsync,
	cuMemcpy2DUnaligned, cuMemcpy3D, cuMemcpy3DAsync, cuMemcpy3DPeer,
	cuMemcpy3DPeerAsync): New prototypes.

libgomp/
	* libgomp-plugin.h (GOMP_OFFLOAD_memcpy2d,
	GOMP_OFFLOAD_memcpy3d): New prototypes.
	* libgomp.h (struct gomp_device_descr): Add memcpy2d_func
	and memcpy3d_func.
	* libgomp.texi (nvptx): Document when cuMemcpy2D/cuMemcpy3D is used.
	* oacc-host.c (memcpy2d_func, .memcpy3d_func): Init with NULL.
	* plugin/cuda-lib.def (cuMemcpy2D, cuMemcpy2DUnaligned,
	cuMemcpy3D): Invoke via CUDA_ONE_CALL.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_memcpy2d,
	GOMP_OFFLOAD_memcpy3d): New.
	* target.c (omp_target_memcpy_rect_worker): Update prototype.
	(omp_target_memcpy_rect_check, omp_target_memcpy_rect_copy):
	Permit all device-to-device copies; invoke new plugins for
	2D and 3D copying when available.
	(gomp_update): Update calls to omp_target_memcpy_rect_worker.  Ensure
	that tmp space is not allocated here.
	(gomp_load_plugin_for_device): DLSYM the new plugin functions.
	* testsuite/libgomp.c/target-12.c: Fix dimension bug.
	* testsuite/libgomp.fortran/target-12.f90: Likewise.
	* testsuite/libgomp.fortran/target-memcpy-rect-1.f90: New test.
2023-09-20 11:18:51 +00:00
Andrew Stubbs 0cf5ac34de libgomp: parallel reverse offload
Extend OpenMP reverse offload support to allow running the host kernels
on multiple threads.  The device plugin API for reverse offload is now made
non-blocking, meaning that running the host kernel in the wrong device
context is no longer a problem.  The NVPTX message passing interface now
uses a ring buffer aproximately matching GCN.

include/ChangeLog:

	* gomp-constants.h (GOMP_VERSION): Bump.

libgomp/ChangeLog:

	* config/gcn/target.c (GOMP_target_ext): Add "signal" field.
	Fix atomics race condition.
	* config/nvptx/libgomp-nvptx.h (REV_OFFLOAD_QUEUE_SIZE): New define.
	(struct rev_offload): Implement ring buffer.
	* config/nvptx/target.c (GOMP_target_ext): Likewise.
	* env.c (initialize_env): Read GOMP_REVERSE_OFFLOAD_THREADS.
	* libgomp-plugin.c (GOMP_PLUGIN_target_rev): Replace "aq" parameter
	with "signal" and "use_aq".
	* libgomp-plugin.h (GOMP_PLUGIN_target_rev): Likewise.
	* libgomp.h (gomp_target_rev): Likewise.
	* plugin/plugin-gcn.c (process_reverse_offload): Add "signal".
	(console_output): Pass signal value through.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_openacc_async_construct):
	Attach new threads to the numbered device.
	Change the flag to CU_STREAM_NON_BLOCKING.
	(GOMP_OFFLOAD_run): Implement ring-buffer and remove signalling.
	* target.c (gomp_target_rev): Rename to ...
	(gomp_target_rev_internal): ... this, and change "dev_num" to
	"devicep".
	(gomp_target_rev_worker_thread): New function.
	(gomp_target_rev): New function (old name).
	* libgomp.texi: Document GOMP_REVERSE_OFFLOAD_THREADS.
	* testsuite/libgomp.c/reverse-offload-threads-1.c: New test.
	* testsuite/libgomp.c/reverse-offload-threads-2.c: New test.
2023-09-12 15:06:31 +01:00
Tobias Burnus 58718b8dc2 Merge branch 'releases/gcc-13' into devel/omp/gcc-13
Merge up to r13-7610-gf1d4b4f918a9c5b25a82b7bfccb08aa5e647f41c (27th July 2021)

This is the GCC 13.2.0 release except for the follow-up commit "Bump BASE-VER"
which sets the version to 13.2.1 to denote post-release branch development.
2023-07-27 12:33:47 +02:00
Jakub Jelinek c891d8dc23 Update ChangeLog and version files for release 2023-07-27 08:13:36 +00:00
Julian Brown 6336f8eaeb OpenMP: Support strided and shaped-array updates for C++
This patch adds support for OpenMP 5.0 strided updates and the
array-shaping operator ("([x][y][z]) foo[0:n]...").  This is mostly for
C++ only so far, though necessary changes have been made to the C FE to
adjust for changes to shared data structures.

In terms of the implementation of various bits:

 - The OMP_ARRAY_SECTION tree code has been extended to take a 'stride'
   argument, and changes have been made throughout semantics.cc, etc. to
   take the new field into account -- including bounds checking.

 - A new type of cast operator has been added to represent the OpenMP
   array-shaping operator: OMP_ARRAYSHAPE_CAST_EXPR (1).

 - The address tokenization mechanism from previous patches has been
   extended with two new access kinds to represent noncontiguous array
   updates.

 - New mapping kinds have been added to represent noncontiguous updates:
   those which may be subject to array shaping, or have non-unit strides.
   These are processed by omp-low.cc into a kind of descriptor that is
   passed to the libgomp runtime (2).

The current patch reuses an extended version of the helper code for
omp_target_memcpy_rect, which may generate very many small host-device or
device-host copies.  (The "descriptor" has also been designed so reusing
that functionality is relatively straightforward.)  Optimising those
multiple copies, e.g. by packing them into a single transfer when it
would be beneficial, is left as the subject of a future patch.

This patch has some adjustments to the omp-low.cc code after Chung-Lin's
patch "OpenMP 5.0: Allow multiple clauses mapping same variable"
(325f085897), relative to the version last
posted for mainline.

Notes:

(1) In a bit more detail: the array-shaping operator has the same
precedence as a C-style cast, but applies to the whole expression,
including array-section specifiers. We parse it initially as if it
applies to the "value" of the whole expression:

  ([x][y]) ptr[0:10:2][1:5:2]

i.e., something like:

  ([x][y]) (ptr[0:10:2][1:5:2])

or as if the cast applies to the innermost/right-hand side array
section. Then, a little later in parsing (cp_parser_omp_var_list_no_open),
we rewrite it to apply to the inner pointer instead:

  (([x][y]) ptr)[0:10:2][1:5:2]

and that means a genuine multi-dimensional array or an array-shaped
pointer can be handled pretty much the same for the rest of
compilation. We use VIEW_CONVERT_EXPR for the "cast", unless we're
processing a template definition, where we use a new tree code instead.

(2) The new map kinds work like this. An update directive starts
out with OMP_CLAUSE_TO or OMP_CLAUSE_FROM clauses representing the
block in question and the direction of the needed transfer. If we
detect a noncontiguous update, we emit a list of mapping nodes (type
OMP_CLAUSE_MAP, with new kinds, so the "mapping group" machinery in
gimplify.cc can be reused):

  OMP_CLAUSE_TO -->

  GOMP_MAP_TO_GRID (VIEW_CONVERT_EXPR<int[x][y]>(ptr) [len: <element-size>])
  GOMP_MAP_GRID_DIM 0 [len: 10]   (i.e. [0:10:2])
  GOMP_MAP_GRID_STRIDE 2
  GOMP_MAP_GRID_DIM 1 [len: 5]    (i.e. [1:5:2])
  GOMP_MAP_GRID_STRIDE 2

During omp-low.cc, this sequence is reformulated into:

  GOMP_MAP_TO_GRID (ptr) [len: <whole array size>]
  GOMP_MAP_TO_PSET (&ptr_desc [len: <desc size>])

"ptr_desc" is a struct, stored statically or constructed on the (host)
stack, containing arrays representing the size of the whole array, the
rectangular subregion to transfer, and the stride with which to walk
over elements in each dimension.

2023-07-03  Julian Brown  <julian@codesourcery.com>

gcc/c-family/
	* c-common.h (expand_array_base): Update prototype.
	* c-omp.cc (c_omp_address_inspector::map_supported_p): Support
	VIEW_CONVERT_EXPR and ADDR_EXPR codes.
	(omp_expand_grid_dim): New function.
	(omp_handle_noncontig_array): New function.
	(c_omp_address_inspector:expand_array_base): Remove DECL_P parameter.
	Support noncontiguous array updates.
	(c_omp_address_inspector::expand_component_selector): Support
	noncontiguous array updates.
	(c_omp_address_inspector::expand_map_clause): Update calls to
	expand_array_base.
	* c-pretty-print.cc (c_pretty_printer::postfix_expression): Add
	OMP_ARRAY_SECTION stride support.

gcc/c/
	* c-parser.cc (c_parser_postfix_expression_after_primary): Dummy stride
	support (for now).
	(struct omp_dim): Add stride support.
	(c_parser_omp_variable_list): Likewise.
	* c-tree.h (build_omp_array_section): Update prototype.
	* c-typeck.cc (mark_exp_read): Add stride support for
	OMP_ARRAY_SECTION.
	(build_omp_array_section): Add stride support.
	(handle_omp_array_sections_1): Add minimal stride support.

gcc/cp/
	* cp-objcp-common.cc (cp_common_init_ts): Add array-shape cast
	support.
	* cp-tree.def (OMP_ARRAYSHAPE_CAST_EXPR): Add tree code.
	* cp-tree.h (DECLTYPE_FOR_OMP_ARRAYSHAPE_CAST): Add flag.
	(cp_omp_create_arrayshape_type, cp_build_omp_arrayshape_cast): Add
	prototypes.
	(grok_omp_array_section, build_omp_array_section): Add stride
	parameters.
	* decl.cc (create_anon_array_type): New function.
	(cp_omp_create_arrayshape_type): New function.
	* decl2.cc (grok_omp_array_section): Add stride parameter.
	(min_vis_expr_r): Add OMP_ARRAYSHAPE_CAST_EXPR support.
	* error.cc (dump_expr): Add stride support for OMP_ARRAY_SECTION.
	* mangle.cc (write_expression): Add OMP_ARRAYSHAPE_CAST_EXPR support.
	* operators.def (OMP_ARRAYSHAPE_CAST_EXPR): Add.
	* parser.cc (cp_parser_new): Initialise omp_array_shaping_op_p and
	omp_has_array_shape_p fields.
	(cp_parser_statement_expr): Don't allow array shaping op in statement
	exprs.
	(cp_parser_postfix_open_square_expression): Add stride parsing for
	array sections.  Use array section code to represent array refs if we
	have an array-shaping operator.
	(cp_parser_parenthesized_expression_list): Don't allow array-shaping
	op here.
	(cp_parser_cast_expression): Add array-shaping operator parsing.
	(cp_parser_lambda_expression): Don't allow array-shaping op in lambda
	body.
	(cp_parser_braced_list): Don't allow array-shaping op in braced list.
	(struct omp_dim): Add stride field.
	(cp_parser_var_list_no_open): Add stride/array shape support.
	(cp_parser_omp_target_update): Handle noncontiguous updates.
	* parser.h (cp_parser): Add omp_array_shaping_op_p and
	omp_has_array_shape_p fields.
	* pt.cc (tsubst): Add array-shape cast support.
	(tsubst_copy, tsubst_copy_and_build): Likewise. Add stride support for
	OMP_ARRAY_SECTION.
	(tsubst_omp_clause_decl): Add stride support for OMP_ARRAY_SECTION.
	* semantics.cc (handle_omp_array_sections_1): Add DISCONTIGUOUS
	parameter and stride support.
	(omp_array_section_low_bound): New function.
	(handle_omp_array_sections): Add DISCONTIGUOUS parameter and stride
	support.
	(finish_omp_clauses): Update calls to handle_omp_array_sections, and
	add noncontiguous array update support.
	(cp_build_omp_arrayshape_cast): New function.
	* typeck.cc (structural_comptypes): Add array-shape cast support.
	(build_omp_array_section): Add stride parameter.
	(check_for_casting_away_constness): Add OMP_ARRAYSHAPE_CAST_EXPR
	support.

gcc/
	* gimplify.cc (omp_group_last, omp_group_base): Add GOMP_MAP_TO_GRID,
	GOMP_MAP_FROM_GRID support.
	(gimplify_adjust_omp_clauses): Support new GOMP_MAP_GRID_DIM,
	GOMP_MAP_GRID_STRIDE mapping nodes.  Don't crash on e.g. misuse of
	ADDR_EXPR in mapping clauses.
	* omp-general.cc (omp_parse_noncontiguous_array): New function.
	(omp_parse_access_method): Add noncontiguous array support.
	(omp_parse_structure_base): Add array-shaping support.
	(debug_omp_tokenized_addr): Add ACCESS_NONCONTIG_ARRAY,
	ACCESS_NONCONTIG_REF_TO_ARRAY token support.
	* omp-general.h (access_method_kinds): Add ACCESS_NONCONTIG_ARRAY and
	ACCESS_NONCONTIG_REF_TO_ARRAY access kinds.
	* omp-low.cc (omp_noncontig_descriptor_type): New function.
	(scan_sharing_clauses): Support noncontiguous array updates.
	(lower_omp_target): Likewise.
	* tree-pretty-print.cc (dump_omp_clause): Add GOMP_MAP_TO_GRID,
	GOMP_MAP_FROM_GRID, GOMP_MAP_GRID_DIM, GOMP_MAP_GRID_STRIDE map kinds.
	(dump_generic_node): Add stride support for OMP_ARRAY_SECTION.
	* tree.def (OMP_ARRAY_SECTION): Add stride argument.

include/
	* gomp-constants.h (gomp_map_kind): Add GOMP_MAP_TO_GRID,
	GOMP_MAP_FROM_GRID, GOMP_MAP_GRID_DIM, GOMP_MAP_GRID_STRIDE map kinds.

gcc/testsuite/
	* g++.dg/gomp/array-shaping-1.C: New test.
	* g++.dg/gomp/array-shaping-2.C: New test.
	* g++.dg/gomp/bad-array-shaping-1.C: New test.
	* g++.dg/gomp/bad-array-shaping-2.C: New test.
	* g++.dg/gomp/bad-array-shaping-3.C: New test.
	* g++.dg/gomp/bad-array-shaping-4.C: New test.
	* g++.dg/gomp/bad-array-shaping-5.C: New test.
	* g++.dg/gomp/bad-array-shaping-6.C: New test.
	* g++.dg/gomp/bad-array-shaping-7.C: New test.
	* g++.dg/gomp/bad-array-shaping-8.C: New test.

libgomp/
	* libgomp.h (omp_noncontig_array_desc): New struct.
	* target.c (omp_target_memcpy_rect_worker): Add stride array
	parameter.  Forward declare.  Add STRIDES parameter and strided
	update support.
	(gomp_update): Add noncontiguous (strided/shaped) update support.
	* testsuite/libgomp.c++/array-shaping-1.C: New test.
	* testsuite/libgomp.c++/array-shaping-2.C: New test.
	* testsuite/libgomp.c++/array-shaping-3.C: New test.
	* testsuite/libgomp.c++/array-shaping-4.C: New test.
	* testsuite/libgomp.c++/array-shaping-5.C: New test.
	* testsuite/libgomp.c++/array-shaping-6.C: New test.
	* testsuite/libgomp.c++/array-shaping-7.C: New test.
	* testsuite/libgomp.c++/array-shaping-8.C: New test.
	* testsuite/libgomp.c++/array-shaping-9.C: New test.
	* testsuite/libgomp.c++/array-shaping-10.C: New test.
	* testsuite/libgomp.c++/array-shaping-11.C: New test.
	* testsuite/libgomp.c++/array-shaping-12.C: New test.
	* testsuite/libgomp.c++/array-shaping-13.C: New test.
2023-07-03 21:40:59 +00:00
Julian Brown 6d82b6c415 OpenMP: Fortran "!$omp declare mapper" support
This patch implements "omp declare mapper" functionality for Fortran,
following the equivalent support for C and C++.  This version of the
patch has been merged to og13 and contains various fixes for e.g.:

  * Mappers with deferred-length strings

  * Array descriptors not being appropriately transferred
    to the offload target (see "OMP_MAP_POINTER_ONLY" and
    gimplify.cc:omp_maybe_get_descriptor_from_ptr).

2023-06-30  Julian Brown  <julian@codesourcery.com>

gcc/fortran/
	* dump-parse-tree.cc (show_attr): Show omp_udm_artificial_var flag.
	(show_omp_namelist): Support OMP_MAP_POINTER_ONLY and OMP_MAP_UNSET.
	* f95-lang.cc (LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES,
	LANG_HOOKS_OMP_EXTRACT_MAPPER_DIRECTIVE,
	LANG_HOOKS_OMP_MAP_ARRAY_SECTION): Define language hooks.
	* gfortran.h (gfc_statement): Add ST_OMP_DECLARE_MAPPER.
	(symbol_attribute): Add omp_udm_artificial_var attribute.
	(gfc_omp_map_op): Add OMP_MAP_POINTER_ONLY and OMP_MAP_UNSET.
	(gfc_omp_namelist): Add udm pointer to u2 union.
	(gfc_omp_udm): New struct.
	(gfc_omp_namelist_udm): New struct.
	(gfc_symtree): Add omp_udm pointer.
	(gfc_namespace): Add omp_udm_root symtree. Add omp_udm_ns flag.
	(gfc_free_omp_namelist): Update prototype.
	(gfc_free_omp_udm, gfc_omp_udm_find, gfc_find_omp_udm,
	gfc_resolve_omp_udms): Add prototypes.
	* match.cc (gfc_free_omp_namelist): Change FREE_NS and FREE_ALIGN
	parameters to LIST number, to handle freeing user-defined mapper
	namelists safely.
	* match.h (gfc_match_omp_declare_mapper): Add prototype.
	* module.cc (ab_attribute): Add AB_OMP_DECLARE_MAPPER_VAR.
	(attr_bits): Add OMP_DECLARE_MAPPER_VAR.
	(mio_symbol_attribute): Read/write AB_OMP_DECLARE_MAPPER_VAR attribute.
	Set referenced attr on read.
	(omp_map_clause_ops, omp_map_cardinality): New arrays.
	(load_omp_udms, check_omp_declare_mappers): New functions.
	(read_module): Load and check OMP declare mappers.
	(write_omp_udm, write_omp_udms): New functions.
	(write_module): Write OMP declare mappers.
	* openmp.cc (gfc_free_omp_clauses, gfc_match_omp_variable_list,
	gfc_match_omp_to_link, gfc_match_omp_depend_sink,
	gfc_match_omp_clause_reduction): Update calls to gfc_free_omp_namelist.
	(gfc_free_omp_udm, gfc_find_omp_udm, gfc_omp_udm_find,
	gfc_match_omp_declare_mapper): New functions.
	(gfc_match_omp_clauses): Add DEFAULT_MAP_OP parameter. Update calls to
	gfc_free_omp_namelist.  Add declare mapper support.
	(resolve_omp_clauses): Add declare mapper support.  Update calls to
	gfc_free_omp_namelist.
	(gfc_resolve_omp_udm, gfc_resolve_omp_udms): New functions.
	* parse.cc (decode_omp_directive): Add declare mapper support.
	(case_omp_decl): Add ST_OMP_DECLARE_MAPPER case.
	(gfc_ascii_statement): Add ST_OMP_DECLARE_MAPPER case.
	* resolve.cc (resolve_types): Call gfc_resolve_omp_udms.
	* st.cc (gfc_free_statement): Update call to gfc_free_omp_namelist.
	* symbol.cc (free_omp_udm_tree): New function.
	(gfc_free_namespace): Call above.
	* trans-decl.cc (omp_declare_mapper_ns): New global.
	(gfc_finish_var_decl, gfc_generate_function_code): Support declare
	mappers.
	(gfc_trans_deferred_vars): Ignore artificial declare-mapper vars.
	* trans-openmp.cc (tree-iterator.h): Include.
	(toc_directive): New enum.
	(gfc_trans_omp_array_section): Change OP and OPENMP parameters to
	toc_directive CD ('clause directive').
	(gfc_omp_finish_mapper_clauses, gfc_omp_extract_mapper_directive,
	gfc_omp_map_array_section): New functions.
	(omp_clause_directive): New enum.
	(gfc_trans_omp_clauses): Remove DECLARE_SIMD and OPENACC parameters.
	Replace with toc_directive CD, defaulting to TOC_OPENMP.  Add declare
	mapper support and OMP_MAP_POINTER_ONLY support.
	(gfc_trans_omp_construct, gfc_trans_oacc_executable_directive,
	gfc_trans_oacc_combined_directive): Update calls to
	gfc_trans_omp_clauses.
	(gfc_subst_replace, gfc_subst_prepend_ref): New variables.
	(gfc_subst_in_expr_1, gfc_subst_in_expr, gfc_subst_mapper_var,
	gfc_trans_omp_instantiate_mapper, gfc_trans_omp_instantiate_mappers,
	gfc_record_mapper_bindings_code_fn, gfc_record_mapper_bindings_expr_fn,
	gfc_find_nested_mappers, gfc_record_mapper_bindings): New functions.
	(gfc_typespec * hash traits): New template.
	(omp_declare_mapper_ns): Extern declaration.
	(gfc_trans_omp_target): Call gfc_trans_omp_instantiate_mappers and
	gfc_record_mapper_bindings. Update calls to gfc_trans_omp_clauses.
	(gfc_trans_omp_declare_simd, gfc_trans_omp_declare_variant): Update
	calls to gfc_trans_omp_clauses.
	(gfc_trans_omp_mapper_name, gfc_trans_omp_declare_mapper,
	gfc_trans_omp_declare_mappers): New functions.
	* trans-stmt.h (gfc_trans_omp_declare_mappers): Add prototype.
	* trans.h (gfc_omp_finish_mapper_clauses,
	gfc_omp_extract_mapper_directive, gfc_omp_map_array_section): Add
	prototypes.

gcc/
	* gimplify.cc (dwarf2out.h): Include.
	(omp_maybe_get_descriptor_from_ptr): New function.
	(build_omp_struct_comp_nodes): Use above function to locate array
	descriptor when necessary.
	(omp_mapping_group_data, omp_mapping_group_ptr,
	omp_mapping_group_pset): New functions.
	(omp_instantiate_mapper): Handle inlining of "declare mapper" function
	bodies containing setup code (e.g. for Fortran).  Handle pointers to
	derived types.  Handle GOMP_MAP_MAPPING_GROUPs.
	* tree-pretty-print.cc (dump_omp_clause): Handle
	GOMP_MAP_MAPPING_GROUP.

include/
	* gomp-constants.h (gomp_map_kind): Add GOMP_MAP_MAPPING_GROUP.

gcc/testsuite/
	* gfortran.dg/gomp/declare-mapper-1.f90: New test.
	* gfortran.dg/gomp/declare-mapper-5.f90: New test.
	* gfortran.dg/gomp/declare-mapper-14.f90: New test.

libgomp/
	* testsuite/libgomp.fortran/declare-mapper-2.f90: New test.
	* testsuite/libgomp.fortran/declare-mapper-3.f90: New test.
	* testsuite/libgomp.fortran/declare-mapper-4.f90: New test.
	* testsuite/libgomp.fortran/declare-mapper-6.f90: New test.
	* testsuite/libgomp.fortran/declare-mapper-7.f90: New test.
	* testsuite/libgomp.fortran/declare-mapper-8.f90: New test.
	* testsuite/libgomp.fortran/declare-mapper-9.f90: New test.
	* testsuite/libgomp.fortran/declare-mapper-10.f90: New test.
	* testsuite/libgomp.fortran/declare-mapper-11.f90: New test.
	* testsuite/libgomp.fortran/declare-mapper-12.f90: New test.
	* testsuite/libgomp.fortran/declare-mapper-13.f90: New test.
	* testsuite/libgomp.fortran/declare-mapper-15.f90: New test.
	* testsuite/libgomp.fortran/declare-mapper-17.f90: New test.
	* testsuite/libgomp.fortran/declare-mapper-18.f90: New test.
	* testsuite/libgomp.fortran/declare-mapper-19.f90: New test.
	* testsuite/libgomp.fortran/declare-mapper-20.f90: New test.
	* testsuite/libgomp.fortran/declare-mapper-21.f90: New test.
2023-06-30 19:42:42 +00:00
Julian Brown f3737ba9b8 OpenMP: C++ "declare mapper" support
This patch adds support for OpenMP 5.0 "declare mapper" functionality
for C++.  I've merged it to og13 based on the last version
posted upstream, with some minor changes due to the newly-added
'present' map modifier support.  There's also a fix to splay-tree
traversal in gimplify.cc:omp_instantiate_implicit_mappers, and this patch
omits the rearrangement of gimplify.cc:gimplify_{scan,adjust}_omp_clauses
that I separated out into its own patch and applied (to og13) already.

2023-06-30  Julian Brown  <julian@codesourcery.com>

gcc/c-family/
	* c-common.h (omp_mapper_list): Add forward declaration.
	(c_omp_find_nested_mappers, c_omp_instantiate_mappers): Add prototypes.
	* c-omp.cc (c_omp_find_nested_mappers): New function.
	(remap_mapper_decl_info): New struct.
	(remap_mapper_decl_1, omp_instantiate_mapper,
	c_omp_instantiate_mappers): New functions.

gcc/cp/
	* constexpr.cc (reduced_constant_expression_p): Add OMP_DECLARE_MAPPER
	case.
	(cxx_eval_constant_expression, potential_constant_expression_1):
	Likewise.
	* cp-gimplify.cc (cxx_omp_finish_mapper_clauses): New function.
	* cp-objcp-common.h (LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES,
	LANG_HOOKS_OMP_MAPPER_LOOKUP, LANG_HOOKS_OMP_EXTRACT_MAPPER_DIRECTIVE,
	LANG_HOOKS_OMP_MAP_ARRAY_SECTION): Define langhooks.
	* cp-tree.h (lang_decl_base): Add omp_declare_mapper_p field.  Recount
	spare bits comment.
	(DECL_OMP_DECLARE_MAPPER_P): New macro.
	(omp_mapper_id, cp_check_omp_declare_mapper, omp_instantiate_mappers,
	cxx_omp_finish_mapper_clauses, cxx_omp_mapper_lookup,
	cxx_omp_extract_mapper_directive, cxx_omp_map_array_section: Add
	prototypes.
	* decl.cc (check_initializer): Add OpenMP declare mapper support.
	(cp_finish_decl): Set DECL_INITIAL for OpenMP declare mapper var decls
	as appropriate.
	* decl2.cc (mark_used): Instantiate OpenMP "declare mapper" magic var
	decls.
	* error.cc (dump_omp_declare_mapper): New function.
	(dump_simple_decl): Use above.
	* parser.cc (cp_parser_omp_clause_map): Add KIND parameter.  Support
	"mapper" modifier.
	(cp_parser_omp_all_clauses): Add KIND argument to
	cp_parser_omp_clause_map call.
	(cp_parser_omp_target): Call omp_instantiate_mappers before
	finish_omp_clauses.
	(cp_parser_omp_declare_mapper): New function.
	(cp_parser_omp_declare): Add "declare mapper" support.
	* pt.cc (tsubst_decl): Adjust name of "declare mapper" magic var decls
	once we know their type.
	(tsubst_omp_clauses): Call omp_instantiate_mappers before
	finish_omp_clauses, for target regions.
	(tsubst_expr): Support OMP_DECLARE_MAPPER nodes.
	(instantiate_decl): Instantiate initialiser (i.e definition) for OpenMP
	declare mappers.
	* semantics.cc (gimplify.h): Include.
	(omp_mapper_id, omp_mapper_lookup, omp_extract_mapper_directive,
	cxx_omp_map_array_section, cp_check_omp_declare_mapper): New functions.
	(finish_omp_clauses): Delete GOMP_MAP_PUSH_MAPPER_NAME and
	GOMP_MAP_POP_MAPPER_NAME artificial clauses.
	(omp_target_walk_data): Add MAPPERS field.
	(finish_omp_target_clauses_r): Scan for uses of struct/union/class type
	variables.
	(finish_omp_target_clauses): Create artificial mapper binding clauses
	for used structs/unions/classes in offload region.

gcc/fortran/
	* parse.cc (tree.h, fold-const.h, tree-hash-traits.h): Add includes
	(for additions to omp-general.h).

gcc/
	* gimplify.cc (gimplify_omp_ctx): Add IMPLICIT_MAPPERS field.
	(new_omp_context): Initialise IMPLICIT_MAPPERS hash map.
	(delete_omp_context): Delete IMPLICIT_MAPPERS hash map.
	(instantiate_mapper_info): New structs.
	(remap_mapper_decl_1, omp_mapper_copy_decl, omp_instantiate_mapper,
	omp_instantiate_implicit_mappers): New functions.
	(gimplify_scan_omp_clauses): Handle MAPPER_BINDING clauses.
	(gimplify_adjust_omp_clauses): Instantiate implicit declared mappers.
	(gimplify_omp_declare_mapper): New function.
	(gimplify_expr): Call above function.
	* langhooks-def.h (lhd_omp_finish_mapper_clauses,
	lhd_omp_mapper_lookup, lhd_omp_extract_mapper_directive,
	lhd_omp_map_array_section): Add prototypes.
	(LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES,
	LANG_HOOKS_OMP_MAPPER_LOOKUP, LANG_HOOKS_OMP_EXTRACT_MAPPER_DIRECTIVE,
	LANG_HOOKS_OMP_MAP_ARRAY_SECTION): Define macros.
	(LANG_HOOK_DECLS): Add above macros.
	* langhooks.cc (lhd_omp_finish_mapper_clauses,
	lhd_omp_mapper_lookup, lhd_omp_extract_mapper_directive,
	lhd_omp_map_array_section): New dummy functions.
	* langhooks.h (lang_hooks_for_decls): Add OMP_FINISH_MAPPER_CLAUSES,
	OMP_MAPPER_LOOKUP, OMP_EXTRACT_MAPPER_DIRECTIVE, OMP_MAP_ARRAY_SECTION
	hooks.
	* omp-general.h (omp_name_type<T>): Add templatized struct, hash type
	traits (for omp_name_type<tree> specialization).
	(omp_mapper_list<T>): Add struct.
	* tree-core.h (omp_clause_code): Add OMP_CLAUSE__MAPPER_BINDING_.
	* tree-pretty-print.cc (dump_omp_clause): Support GOMP_MAP_UNSET,
	GOMP_MAP_PUSH_MAPPER_NAME, GOMP_MAP_POP_MAPPER_NAME artificial mapping
	clauses.  Support OMP_CLAUSE__MAPPER_BINDING_ and OMP_DECLARE_MAPPER.
	* tree.cc (omp_clause_num_ops, omp_clause_code_name): Add
	OMP_CLAUSE__MAPPER_BINDING_.
	* tree.def (OMP_DECLARE_MAPPER): New tree code.
	* tree.h (OMP_DECLARE_MAPPER_ID, OMP_DECLARE_MAPPER_DECL,
	OMP_DECLARE_MAPPER_CLAUSES): New defines.
	(OMP_CLAUSE__MAPPER_BINDING__ID, OMP_CLAUSE__MAPPER_BINDING__DECL,
	OMP_CLAUSE__MAPPER_BINDING__MAPPER): New defines.

include/
	* gomp-constants.h (gomp_map_kind): Add GOMP_MAP_UNSET,
	GOMP_MAP_PUSH_MAPPER_NAME, GOMP_MAP_POP_MAPPER_NAME artificial mapping
	clause types.

gcc/testsuite/
	* c-c++-common/gomp/map-6.c: Update error scan output.
	* c-c++-common/gomp/declare-mapper-3.c: New test (only enabled for C++
	for now).
	* c-c++-common/gomp/declare-mapper-4.c: Likewise.
	* c-c++-common/gomp/declare-mapper-5.c: Likewise.
	* c-c++-common/gomp/declare-mapper-6.c: Likewise.
	* c-c++-common/gomp/declare-mapper-7.c: Likewise.
	* c-c++-common/gomp/declare-mapper-8.c: Likewise.
	* c-c++-common/gomp/declare-mapper-9.c: Likewise.
	* c-c++-common/gomp/declare-mapper-12.c: Likewise.
	* g++.dg/gomp/declare-mapper-1.C: New test.
	* g++.dg/gomp/declare-mapper-2.C: New test.

libgomp/
	* testsuite/libgomp.c++/declare-mapper-1.C: New test.
	* testsuite/libgomp.c++/declare-mapper-2.C: New test.
	* testsuite/libgomp.c++/declare-mapper-3.C: New test.
	* testsuite/libgomp.c++/declare-mapper-4.C: New test.
	* testsuite/libgomp.c++/declare-mapper-5.C: New test.
	* testsuite/libgomp.c++/declare-mapper-6.C: New test.
	* testsuite/libgomp.c++/declare-mapper-7.C: New test.
	* testsuite/libgomp.c++/declare-mapper-8.C: New test.
	* testsuite/libgomp.c-c++-common/declare-mapper-9.c: New test (only
	enabled for C++ for now).
	* testsuite/libgomp.c-c++-common/declare-mapper-10.c: Likewise.
	* testsuite/libgomp.c-c++-common/declare-mapper-11.c: Likewise.
	* testsuite/libgomp.c-c++-common/declare-mapper-12.c: Likewise.
	* testsuite/libgomp.c-c++-common/declare-mapper-13.c: Likewise.
	* testsuite/libgomp.c-c++-common/declare-mapper-14.c: Likewise.
2023-06-30 19:35:48 +00:00
Julian Brown 29bc958efc OpenACC: Improve implicit mapping for non-lexically nested offload regions
This patch enables use of the OMP_CLAUSE_RUNTIME_IMPLICIT_P flag for
OpenACC.

This allows code like this to work correctly:

  int arr[100];
  [...]
  #pragma acc enter data copyin(arr[20:10])

  /* No explicit mapping of 'arr' here.  */
  #pragma acc parallel
  { /* use of arr[20:10]... */ }

  #pragma acc exit data copyout(arr[20:10])

Otherwise, the implicit "copy" ("present_or_copy") on the parallel
corresponds to the whole array, and that fails at runtime when the
subarray is mapped.

The numbering of the GOMP_MAP_IMPLICIT bit clashes with the OpenACC
"non-contiguous" dynamic array support, so the GOMP_MAP_NONCONTIG_ARRAY_P
macro has been adjusted to account for that.

This behaviour relates to upstream OpenACC issue 490 (not yet resolved).

2023-06-16  Julian Brown  <julian@codesourcery.com>

gcc/
	* gimplify.cc (gimplify_adjust_omp_clauses_1): Set
	OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P for OpenACC also.

gcc/testsuite/
	* c-c++-common/goacc/combined-reduction.c: Adjust scan output.
	* c-c++-common/goacc/reduction-1.c: Likewise.
	* c-c++-common/goacc/reduction-2.c: Likewise.
	* c-c++-common/goacc/reduction-3.c: Likewise.
	* c-c++-common/goacc/reduction-4.c: Likewise.
	* c-c++-common/goacc/reduction-10.c: Likewise.
	* gfortran.dg/goacc/loop-tree-1.f90: Likewise.

include/
	* gomp-constants.h (GOMP_MAP_NONCONTIG_ARRAY_P): Tweak condition.

libgomp/
	* testsuite/libgomp.oacc-c-c++-common/implicit-mapping-1.c: New test.
2023-06-19 22:15:56 +00:00
Julian Brown f4cd03a9a3 OpenMP/OpenACC: Unordered/non-constant component offset runtime diagnostic
This patch adds support for non-constant component offsets in "map"
clauses for OpenMP (and the equivalants for OpenACC), which are not able
to be sorted into order at compile time.  Normally struct accesses in
such clauses are gathered together and sorted into increasing address
order after a "GOMP_MAP_STRUCT" node: if we have variable indices,
that is no longer possible.

This version of the patch scales back the previously-posted version to
merely add a diagnostic for incorrect usage of component accesses with
variably-indexed arrays of structs: the only permitted variant is where
we have multiple indices that are the same, but we could not prove so
at compile time.  Rather than silently producing the wrong result for
cases where the indices are in fact different, we error out (e.g.,
"map(dtarr(i)%arrptr, dtarr(j)%arrptr(4:8))", for different i/j).

For now, multiple *constant* array indices are still supported (see
map-arrayofstruct-1.c).  That could perhaps be addressed with a follow-up
patch, if necessary.

This version of the patch renumbers the GOMP_MAP_STRUCT_UNORD kind to
avoid clashing with the OpenACC "non-contiguous" dynamic array support.

2023-06-16  Julian Brown  <julian@codesourcery.com>

gcc/fortran/
	* trans-openmp.cc (gfc_omp_deep_map_kind_p): Add GOMP_MAP_STRUCT_UNORD.

gcc/
	* gimplify.cc (extract_base_bit_offset): Add VARIABLE_OFFSET parameter.
	(omp_get_attachment, omp_group_last, omp_group_base,
	omp_directive_maps_explicitly): Add GOMP_MAP_STRUCT_UNORD support.
	(omp_accumulate_sibling_list): Update calls to extract_base_bit_offset.
	Support GOMP_MAP_STRUCT_UNORD.
	(omp_build_struct_sibling_lists, gimplify_scan_omp_clauses,
	gimplify_adjust_omp_clauses, gimplify_omp_target_update): Add
	GOMP_MAP_STRUCT_UNORD support.
	* omp-low.cc (lower_omp_target): Add GOMP_MAP_STRUCT_UNORD support.
	* tree-pretty-print.cc (dump_omp_clause): Likewise.

include/
	* gomp-constants.h (gomp_map_kind): Add GOMP_MAP_STRUCT_UNORD.

libgomp/
	* oacc-mem.c (find_group_last, goacc_enter_data_internal,
	goacc_exit_data_internal, GOACC_enter_exit_data): Add
	GOMP_MAP_STRUCT_UNORD support.
	* target.c (gomp_map_vars_internal): Add GOMP_MAP_STRUCT_UNORD support.
	Detect incorrect use of variable indexing of arrays of structs.
	(GOMP_target_enter_exit_data, gomp_target_task_fn): Add
	GOMP_MAP_STRUCT_UNORD support.
	* testsuite/libgomp.c-c++-common/map-arrayofstruct-1.c: New test.
	* testsuite/libgomp.c-c++-common/map-arrayofstruct-2.c: New test.
	* testsuite/libgomp.c-c++-common/map-arrayofstruct-3.c: New test.
	* testsuite/libgomp.fortran/map-subarray-5.f90: New test.
2023-06-19 22:15:55 +00:00
Tobias Burnus 77e17f596f OpenMP: Cleanups related to the 'present' modifier
Reduce number of enum values passed to libgomp as
GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} have the same semantic as
GOMP_MAP_FORCE_PRESENT (i.e. abort if not present, otherwise ignore);
that's different to GOMP_MAP_ALWAYS_PRESENT_{TO,TOFROM,FROM} which also
abort if not present but copy data when present. This is is a follow-up to
the commit r14-1579-g4ede915d5dde93 done 6 days ago.

Additionally, the commit improves a libgomp run-time and a C/C++ compile-time
error wording and extends testcases a tiny bit.

gcc/c/ChangeLog:

	* c-parser.cc (c_parser_omp_clause_map): Reword error message for
	clearness especially with 'omp target (enter/exit) data.'

gcc/cp/ChangeLog:

	* parser.cc (cp_parser_omp_clause_map): Reword error message for
	clearness especially with 'omp target (enter/exit) data.'
	* semantics.cc (handle_omp_array_sections): Handle
	GOMP_MAP_{ALWAYS_,}PRESENT_{TO,TOFROM,FROM,ALLOC} enum values.

gcc/ChangeLog:

	* gimplify.cc (gimplify_adjust_omp_clauses_1): Use
	GOMP_MAP_FORCE_PRESENT for 'present alloc' implicit mapping.
	(gimplify_adjust_omp_clauses): Change
	GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} to the equivalent
	GOMP_MAP_FORCE_PRESENT.
	* omp-low.cc (lower_omp_target): Remove handling of no-longer valid
	GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC}; update map kinds used for
	to/from clauses with present modifier.

include/ChangeLog:

	* gomp-constants.h (enum gomp_map_kind): Change the enum values
	GOMP_MAP_PRESENT_{TO,TOFROM,FROM,ALLOC} to be compiler only.
	(GOMP_MAP_PRESENT_P): Update to include also GOMP_MAP_FORCE_PRESENT.

libgomp/ChangeLog:

	* target.c (gomp_to_device_kind_p, gomp_map_vars_internal): Replace
	GOMP_MAP_PRESENT_{FROM,TO,TOFROM,ACLLOC} by GOMP_MAP_FORCE_PRESENT.
	(gomp_map_vars_internal, gomp_update): Likewise; unify and improve
	error message.
	* testsuite/libgomp.c-c++-common/target-present-2.c: Update for
	changed error message.
	* testsuite/libgomp.fortran/target-present-1.f90: Likewise.
	* testsuite/libgomp.fortran/target-present-2.f90: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/present-1.c: Likewise.
	* testsuite/libgomp.c-c++-common/target-present-1.c: Likewise and
	extend testcase to check that data is copied when needed.
	* testsuite/libgomp.c-c++-common/target-present-3.c: Likewise.
	* testsuite/libgomp.fortran/target-present-3.f90: Likewise.

gcc/testsuite/ChangeLog:

	* c-c++-common/gomp/defaultmap-4.c: Update scan-tree-dump.
	* c-c++-common/gomp/map-9.c: Likewise.
	* gfortran.dg/gomp/defaultmap-8.f90: Likewise.
	* gfortran.dg/gomp/map-11.f90: Likewise.
	* gfortran.dg/gomp/target-update-1.f90: Likewise.
	* gfortran.dg/gomp/map-12.f90: Likewise; also check original dump.
	* c-c++-common/gomp/map-6.c: Update dg-error and also check
	clause error with 'target (enter/exit) data'.

gcc/fortran/ChangeLog.omp:

	* trans-openmp.cc (gfc_omp_deep_map_kind_p): Fix conditions for
	present modifier.

(cherry picked from commit 38944ec2a6)
2023-06-12 18:50:48 +02:00
Kwok Cheung Yeung 167b496415 openmp: Add support for the 'present' modifier
This implements support for the OpenMP 5.1 'present' modifier, which can be
used in map clauses in the 'target', 'target data', 'target data enter' and
'target data exit' constructs, and in the 'to' and 'from' clauses of the
'target update' construct.  It is also supported in defaultmap.

The modifier triggers a fatal runtime error if the data specified by the
clause is not already present on the target device.  It can also be combined
with 'always' in map clauses.

2023-06-06  Kwok Cheung Yeung  <kcy@codesourcery.com>
	    Tobias Burnus  <tobias@codesourcery.com>

gcc/c/
	* c-parser.cc (c_parser_omp_clause_defaultmap,
	c_parser_omp_clause_map): Parse 'present'.
	(c_parser_omp_clause_to, c_parser_omp_clause_from): Remove.
	(c_parser_omp_clause_from_to): New; parse to/from clauses with
	optional present modifer.
	(c_parser_omp_all_clauses): Update call.
	(c_parser_omp_target_data, c_parser_omp_target_enter_data,
	c_parser_omp_target_exit_data): Handle new map enum values
	for 'present' mapping.

gcc/cp/
	* parser.cc (cp_parser_omp_clause_defaultmap,
	cp_parser_omp_clause_map): Parse 'present'.
	(cp_parser_omp_clause_from_to): New; parse to/from
	clauses with optional 'present' modifier.
	(cp_parser_omp_all_clauses): Update call.
	(cp_parser_omp_target_data, cp_parser_omp_target_enter_data,
	cp_parser_omp_target_exit_data): Handle new enum value for
	'present' mapping.
	* semantics.cc (finish_omp_target): Likewise.

gcc/fortran/
	* dump-parse-tree.cc (show_omp_namelist): Display 'present' map
	modifier.
	(show_omp_clauses): Display 'present' motion modifier for 'to'
	and 'from' clauses.

	* gfortran.h (enum gfc_omp_map_op): Add entries with 'present'
	modifiers.
	(struct gfc_omp_namelist): Add 'present_modifer'.
	* openmp.cc (gfc_match_motion_var_list): New, handles optional
	'present' modifier for to/from clauses.
	(gfc_match_omp_clauses): Call it for to/from clauses; parse 'present'
	in defaultmap and map clauses.
	(resolve_omp_clauses): Allow 'present' modifiers on 'target',
	'target data', 'target enter' and 'target exit'	directives.
	* trans-openmp.cc (gfc_trans_omp_clauses): Apply 'present' modifiers
	to tree node for 'map', 'to' and 'from'	clauses.  Apply 'present' for
	defaultmap.

gcc/
	* gimplify.cc (omp_notice_variable): Apply GOVD_MAP_ALLOC_ONLY flag
	and defaultmap flags if the defaultmap has GOVD_MAP_FORCE_PRESENT flag
	set.
	(omp_get_attachment): Handle map clauses with 'present' modifier.
	(omp_group_base): Likewise.
	(gimplify_scan_omp_clauses): Reorder present maps to come first.
	Set GOVD flags for present defaultmaps.
	(gimplify_adjust_omp_clauses_1): Set map kind for present defaultmaps.
	* omp-low.cc (scan_sharing_clauses): Handle 'always, present' map
	clauses.
	(lower_omp_target): Handle map clauses with 'present' modifier.
	Handle 'to' and 'from' clauses with 'present'.
	* tree-core.h (enum omp_clause_defaultmap_kind): Add
	OMP_CLAUSE_DEFAULTMAP_PRESENT defaultmap kind.
	* tree-pretty-print.cc (dump_omp_clause): Handle 'map', 'to' and
	'from' clauses with 'present' modifier.  Handle present defaultmap.
	* tree.h (OMP_CLAUSE_MOTION_PRESENT): New #define.

include/
	* gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_5): New.
	(GOMP_MAP_FLAG_FORCE): Redefine.
	(GOMP_MAP_FLAG_PRESENT, GOMP_MAP_FLAG_ALWAYS_PRESENT): New.
	(enum gomp_map_kind): Add map kinds with 'present' modifiers.
	(GOMP_MAP_COPY_TO_P, GOMP_MAP_COPY_FROM_P): Evaluate to true for
	map variants with 'present'
	(GOMP_MAP_ALWAYS_TO_P, GOMP_MAP_ALWAYS_FROM_P): Evaluate to true
	for map variants with 'always, present' modifiers.
	(GOMP_MAP_ALWAYS): Redefine.
	(GOMP_MAP_FORCE_P, GOMP_MAP_PRESENT_P): New.

libgomp/
	* libgomp.texi (OpenMP 5.1 Impl. status): Set 'present' support for
	defaultmap to 'Y', add 'Y' entry for 'present' on to/from/map clauses.
	* target.c (gomp_to_device_kind_p): Add map kinds with 'present'
	modifier.
	(gomp_map_vars_existing): Use new GOMP_MAP_FORCE_P macro.
	(gomp_map_vars_internal, gomp_update, gomp_target_rev):
	Emit runtime error if memory region not present.
	* testsuite/libgomp.c-c++-common/target-present-1.c: New test.
	* testsuite/libgomp.c-c++-common/target-present-2.c: New test.
	* testsuite/libgomp.c-c++-common/target-present-3.c: New test.
	* testsuite/libgomp.fortran/target-present-1.f90: New test.
	* testsuite/libgomp.fortran/target-present-2.f90: New test.
	* testsuite/libgomp.fortran/target-present-3.f90: New test.

gcc/testsuite/

	* c-c++-common/gomp/map-6.c: Update dg-error, extend to test for
	duplicated 'present' and extend scan-dump tests for 'present'.
	* gfortran.dg/gomp/defaultmap-1.f90: Update dg-error.
	* gfortran.dg/gomp/map-7.f90: Extend parse and dump test for
	'present'.
	* gfortran.dg/gomp/map-8.f90: Extend for duplicate 'present'
	modifier checking.
	* c-c++-common/gomp/defaultmap-4.c: New test.
	* c-c++-common/gomp/map-9.c: New test.
	* c-c++-common/gomp/target-update-1.c: New test.
	* gfortran.dg/gomp/defaultmap-8.f90: New test.
	* gfortran.dg/gomp/map-11.f90: New test.
	* gfortran.dg/gomp/map-12.f90: New test.
	* gfortran.dg/gomp/target-update-1.f90: New test.

Co-Authored-By: Tobias Burnus <tobias@codesourcery.com>

(cherry picked from commit 4ede915d5d)
2023-06-09 12:43:55 +02:00
Tobias Burnus 8341c02b5c Revert "openmp: Add support for the 'present' modifier"
This reverts commit 6e3816fa47.
which then permits to apply the mainline patch of it more cleanly.
2023-06-07 16:49:56 +02:00
Kwok Cheung Yeung 92f6019f10 Merge branch 'releases/gcc-13' into devel/omp/gcc-13 2023-05-22 20:09:16 +01:00
Joseph Myers 83983b4363 Implement LDPT_REGISTER_CLAIM_FILE_HOOK_V2 linker plugin hook [PR109128]
This is one part of the fix for PR109128, along with a corresponding
binutils's linker change.  Without this patch, what happens in the
linker, when an unused object in a .a file has offload data, is that
elf_link_is_defined_archive_symbol calls bfd_link_plugin_object_p,
which ends up calling the plugin's claim_file_handler, which then
records the object as one with offload data. That is, the linker never
decides to use the object in the first place, but use of this _p
interface (called as part of trying to decide whether to use the
object) results in the plugin deciding to use its offload data (and a
consequent mismatch in the offload data present at runtime).

The new hook allows the linker plugin to distinguish calls to
claim_file_handler that know the object is being used by the linker
(from ldmain.c:add_archive_element), from calls that don't know it's
being used by the linker (from elf_link_is_defined_archive_symbol); in
the latter case, the plugin should avoid recording the object as one
with offload data.

	PR middle-end/109128

	include/
	* plugin-api.h (ld_plugin_claim_file_handler_v2)
	(ld_plugin_register_claim_file_v2)
	(LDPT_REGISTER_CLAIM_FILE_HOOK_V2): New.
	(struct ld_plugin_tv): Add tv_register_claim_file_v2.

	lto-plugin/
	* lto-plugin.c (register_claim_file_v2): New.
	(claim_file_handler_v2): New.
	(claim_file_handler): Wrap claim_file_handler_v2.
	(onload): Handle LDPT_REGISTER_CLAIM_FILE_HOOK_V2.

(cherry picked from commit c49d51fa81)
2023-05-19 16:37:17 +01:00
Thomas Schwinge d4c585eeaa '-foffload-memory=pinned' using offloading device interfaces
Implemented for nvptx offloading via 'cuMemHostAlloc', 'cuMemHostRegister'.

	gcc/
	* doc/invoke.texi (-foffload-memory=pinned): Document.
	include/
	* cuda/cuda.h (CUresult): Add
	'CUDA_ERROR_HOST_MEMORY_ALREADY_REGISTERED'.
	(CUdevice_attribute): Add
	'CU_DEVICE_ATTRIBUTE_READ_ONLY_HOST_REGISTER_SUPPORTED'.
	(CU_MEMHOSTREGISTER_READ_ONLY): Add.
	(cuMemHostGetFlags, cuMemHostRegister, cuMemHostUnregister): Add.
	libgomp/
	* libgomp-plugin.h (GOMP_OFFLOAD_page_locked_host_free): Add
	'struct goacc_asyncqueue *' formal parameter.
	(GOMP_OFFLOAD_page_locked_host_register)
	(GOMP_OFFLOAD_page_locked_host_unregister)
	(GOMP_OFFLOAD_page_locked_host_p): Add.
	* libgomp.h (always_pinned_mode)
	(gomp_page_locked_host_register_dev)
	(gomp_page_locked_host_unregister_dev): Add.
	(struct splay_tree_key_s): Add 'page_locked_host_p'.
	(struct gomp_device_descr): Add
	'GOMP_OFFLOAD_page_locked_host_register',
	'GOMP_OFFLOAD_page_locked_host_unregister',
	'GOMP_OFFLOAD_page_locked_host_p'.
	* libgomp.texi (-foffload-memory=pinned): Document.
	* plugin/cuda-lib.def (cuMemHostGetFlags, cuMemHostRegister_v2)
	(cuMemHostRegister, cuMemHostUnregister): Add.
	* plugin/plugin-nvptx.c (struct ptx_device): Add
	'read_only_host_register_supported'.
	(nvptx_open_device): Initialize it.
	(free_host_blocks, free_host_blocks_lock)
	(nvptx_run_deferred_page_locked_host_free)
	(nvptx_page_locked_host_free_callback, nvptx_page_locked_host_p)
	(GOMP_OFFLOAD_page_locked_host_register)
	(nvptx_page_locked_host_unregister_callback)
	(GOMP_OFFLOAD_page_locked_host_unregister)
	(GOMP_OFFLOAD_page_locked_host_p)
	(nvptx_run_deferred_page_locked_host_unregister)
	(nvptx_move_page_locked_host_unregister_blocks_aq1_aq2_callback):
	Add.
	(GOMP_OFFLOAD_fini_device, GOMP_OFFLOAD_page_locked_host_alloc)
	(GOMP_OFFLOAD_run): Call
	'nvptx_run_deferred_page_locked_host_free'.
	(struct goacc_asyncqueue): Add
	'page_locked_host_unregister_blocks_lock',
	'page_locked_host_unregister_blocks'.
	(nvptx_goacc_asyncqueue_construct)
	(nvptx_goacc_asyncqueue_destruct): Handle those.
	(GOMP_OFFLOAD_page_locked_host_free): Handle
	'struct goacc_asyncqueue *' formal parameter.
	(GOMP_OFFLOAD_openacc_async_test)
	(nvptx_goacc_asyncqueue_synchronize): Call
	'nvptx_run_deferred_page_locked_host_unregister'.
	(GOMP_OFFLOAD_openacc_async_serialize): Call
	'nvptx_move_page_locked_host_unregister_blocks_aq1_aq2_callback'.
	* config/linux/allocator.c (linux_memspace_alloc)
	(linux_memspace_calloc, linux_memspace_free)
	(linux_memspace_realloc): Remove 'always_pinned_mode' handling.
	(GOMP_enable_pinned_mode): Move...
	* target.c: ... here.
	(always_pinned_mode, verify_always_pinned_mode)
	(gomp_verify_always_pinned_mode, gomp_page_locked_host_alloc_dev)
	(gomp_page_locked_host_free_dev)
	(gomp_page_locked_host_aligned_alloc_dev)
	(gomp_page_locked_host_aligned_free_dev)
	(gomp_page_locked_host_register_dev)
	(gomp_page_locked_host_unregister_dev): Add.
	(gomp_copy_host2dev, gomp_map_vars_internal)
	(gomp_remove_var_internal, gomp_unmap_vars_internal)
	(get_gomp_offload_icvs, gomp_load_image_to_device)
	(gomp_target_rev, omp_target_memcpy_copy)
	(omp_target_memcpy_rect_worker): Handle 'always_pinned_mode'.
	(gomp_copy_host2dev, gomp_copy_dev2host): Handle
	'verify_always_pinned_mode'.
	(GOMP_target_ext): Add 'assert'.
	(gomp_page_locked_host_alloc): Use
	'gomp_page_locked_host_alloc_dev'.
	(gomp_page_locked_host_free): Use
	'gomp_page_locked_host_free_dev'.
	(omp_target_associate_ptr): Adjust.
	(gomp_load_plugin_for_device): Handle 'page_locked_host_register',
	'page_locked_host_unregister', 'page_locked_host_p'.
	* oacc-mem.c (memcpy_tofrom_device): Handle 'always_pinned_mode'.
	* libgomp_g.h (GOMP_enable_pinned_mode): Adjust.
	* testsuite/libgomp.c/alloc-pinned-7.c: Remove.
2023-05-18 16:11:56 +01:00
Thomas Schwinge c9e88c6906 Attempt to not just register but allocate OpenMP pinned memory using a device
... instead of 'mmap' plus attempting to register using a device.

Implemented for nvptx offloading via 'cuMemHostAlloc'.

This re-works og12 commit a5a4800e92
"Attempt to register OpenMP pinned memory using a device instead of 'mlock'".

	include/
	* cuda/cuda.h (cuMemHostRegister, cuMemHostUnregister): Remove.
	libgomp/
	* config/linux/allocator.c (linux_memspace_alloc): Add 'init0'
	formal parameter.  Adjust all users.
	(linux_memspace_alloc, linux_memspace_free): Attempt to allocate
	OpenMP pinned memory using a device instead of 'mmap' plus
	attempting to register using a device.
	* libgomp-plugin.h (GOMP_OFFLOAD_register_page_locked)
	(GOMP_OFFLOAD_unregister_page_locked): Remove.
	(GOMP_OFFLOAD_page_locked_host_alloc)
	(GOMP_OFFLOAD_page_locked_host_free): New.
	* libgomp.h (gomp_register_page_locked)
	(gomp_unregister_page_locked): Remove.
	(gomp_page_locked_host_alloc, gomp_page_locked_host_free): New.
	(struct gomp_device_descr): Remove 'register_page_locked_func',
	'unregister_page_locked_func'.  Add 'page_locked_host_alloc_func',
	'page_locked_host_free_func'.
	* plugin/cuda-lib.def (cuMemHostRegister_v2, cuMemHostRegister)
	(cuMemHostUnregister): Remove.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_register_page_locked)
	(GOMP_OFFLOAD_unregister_page_locked): Remove.
	(GOMP_OFFLOAD_page_locked_host_alloc)
	(GOMP_OFFLOAD_page_locked_host_free): New.
	* target.c (gomp_register_page_locked)
	(gomp_unregister_page_locked): Remove.
	(gomp_page_locked_host_alloc, gomp_page_locked_host_free): Add.
	(gomp_load_plugin_for_device): Don't handle
	'register_page_locked', 'unregister_page_locked'.  Handle
	'page_locked_host_alloc', 'page_locked_host_free'.

Suggested-by: Andrew Stubbs <ams@codesourcery.com>
2023-05-18 16:11:54 +01:00
Thomas Schwinge 23fa6a8ab0 Attempt to register OpenMP pinned memory using a device instead of 'mlock'
Implemented for nvptx offloading via 'cuMemHostRegister'.  This means: (a) not
running into 'mlock' limitations, and (b) the device is aware of this and may
optimize host <-> device memory transfers.

This re-works og12 commit ab7520b3b4
"libgomp: pinned memory".

	include/
	* cuda/cuda.h (cuMemHostRegister, cuMemHostUnregister): New.
	libgomp/
	* config/linux/allocator.c (linux_memspace_alloc)
	(linux_memspace_free, linux_memspace_realloc): Attempt to register
	OpenMP pinned memory using a device instead of 'mlock'.
	* libgomp-plugin.h (GOMP_OFFLOAD_register_page_locked)
	(GOMP_OFFLOAD_unregister_page_locked): New.
	* libgomp.h (gomp_register_page_locked)
	(gomp_unregister_page_locked): New
	(struct gomp_device_descr): Add 'register_page_locked_func',
	'unregister_page_locked_func'.
	* plugin/cuda-lib.def (cuMemHostRegister_v2, cuMemHostRegister)
	(cuMemHostUnregister): New.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_register_page_locked)
	(GOMP_OFFLOAD_unregister_page_locked): New.
	* target.c (gomp_register_page_locked)
	(gomp_unregister_page_locked): New.
	(gomp_load_plugin_for_device): Handle 'register_page_locked',
	'unregister_page_locked'.
	* testsuite/libgomp.c/alloc-pinned-1.c: Adjust.
	* testsuite/libgomp.c/alloc-pinned-2.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-3.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-4.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-5.c: Likewise.
	* testsuite/libgomp.c/alloc-pinned-6.c: Likewise.
2023-05-18 16:11:54 +01:00
Kwok Cheung Yeung 6e3816fa47 openmp: Add support for the 'present' modifier
This implements support for the OpenMP 5.1 'present' modifier, which can be
used in map clauses in the 'target', 'target data', 'target data enter' and
'target data exit' constructs, and in the 'to' and 'from' clauses of the
'target update' construct.  It is also supported in defaultmap.

The modifier triggers a fatal runtime error if the data specified by the
clause is not already present on the target device.  It can also be combined
with 'always' in map clauses.

2023-02-01  Kwok Cheung Yeung  <kcy@codesourcery.com>

	gcc/c/
	* c-parser.cc (c_parser_omp_variable_list): Set default motion
	modifier.
	(c_parser_omp_var_list_parens): Add new parameter with default.  Parse
	'present' motion modifier and apply.
	(c_parser_omp_clause_defaultmap): Parse 'present' in defaultmap.
	(c_parser_omp_clause_map): Parse 'present' modifier in map clauses.
	(c_parser_omp_clause_to): Allow use of 'present' in variable list.
	(c_parser_omp_clause_from): Likewise.
	(c_parser_omp_target_data): Allow map clauses with 'present'
	modifiers.
	(c_parser_omp_target_enter_data): Likewise.
	(c_parser_omp_target_exit_data): Likewise.
	(c_parser_omp_target): Likewise.

	gcc/cp/
	* parser.cc (cp_parser_omp_var_list_no_open): Add new parameter with
	default.  Parse	'present' motion modifier and apply.
	(cp_parser_omp_clause_defaultmap): Parse 'present' in defaultmap.
	(cp_parser_omp_clause_map): Parse 'present' modifier in map clauses.
	(cp_parser_omp_all_clauses): Allow use of 'present' in 'to' and 'from'
	clauses.
	(cp_parser_omp_target_data): Allow map clauses with 'present'
	modifiers.
	(cp_parser_omp_target_enter_data): Likewise.
	(cp_parser_omp_target_exit_data): Likewise.
	* semantics.cc (finish_omp_target): Accept map clauses with 'present'
	modifiers.

	gcc/fortran/
	* dump-parse-tree.cc (show_omp_namelist): Display 'present' map
	modifier.
	(show_omp_clauses): Display 'present' motion modifier for 'to'
	and 'from' clauses.
	* gfortran.h (enum gfc_omp_map_op): Add entries with 'present'
	modifiers.
	(enum gfc_omp_motion_modifier): New.
	(struct gfc_omp_namelist): Add motion_modifier field.
	* openmp.cc (gfc_match_omp_variable_list): Add new parameter with
	default.  Parse 'present' motion modifier and apply.
	(gfc_match_omp_clauses): Parse 'present' in defaultmap, 'from'
	clauses, 'map' clauses and 'to' clauses.
	(resolve_omp_clauses): Allow 'present' modifiers on 'target',
	'target data', 'target enter' and 'target exit'	directives.
	* trans-openmp.cc (gfc_omp_deep_map_kind_p): Handle map kinds with
	'present' modifier.
	(gfc_trans_omp_clauses): Apply 'present' modifiers to tree node for
	'map', 'to' and 'from'	clauses.  Apply 'present' for defaultmap.

	gcc/
	* gimplify.cc (omp_notice_variable): Apply GOVD_MAP_ALLOC_ONLY flag
	and defaultmap flags if the defaultmap has GOVD_MAP_FORCE_PRESENT flag
	set.
	(omp_get_attachment): Handle map clauses with 'present' modifier.
	(omp_group_base): Likewise.
	(gimplify_scan_omp_clauses): Reorder present maps to come first.
	Set GOVD flags for present defaultmaps.
	(gimplify_adjust_omp_clauses_1): Set map kind for present defaultmaps.
	* omp-low.cc (scan_sharing_clauses): Handle 'always, present' map
	clauses.
	(lower_omp_target): Handle map clauses with 'present' modifier.
	Handle 'to' and 'from' clauses with 'present'.
	* tree-core.h (enum omp_clause_defaultmap_kind): Add
	OMP_CLAUSE_DEFAULTMAP_PRESENT defaultmap kind.
	(enum omp_clause_motion_modifier): New.
	(struct tree_omp_clause): Add motion_modifier field.
	* tree-pretty-print.cc (dump_omp_clause): Handle 'map', 'to' and
	'from' clauses with 'present' modifier.  Handle present defaultmap.
	* tree.h (OMP_CLAUSE_MOTION_MODIFIER): New.
	(OMP_CLAUSE_SET_MOTION_MODIFIER): New.

	gcc/testsuite/
	* c-c++-common/gomp/defaultmap-4.c: New.
	* c-c++-common/gomp/map-6.c: Update expected error messages.
	* c-c++-common/gomp/map-9.c: New.
	* c-c++-common/gomp/target-update-1.c: New.
	* gfortran.dg/gomp/defaultmap-1.f90: Update expected error messages.
	* gfortran.dg/gomp/defaultmap-8.f90: New.
	* gfortran.dg/gomp/map-10.f90: New.
	* gfortran.dg/gomp/target-update-1.f90: New.

	include/
	* gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_5): New.
	(GOMP_MAP_FLAG_FORCE): Redefine.
	(GOMP_MAP_FLAG_PRESENT): New.
	(GOMP_MAP_FLAG_ALWAYS_PRESENT): New.
	(enum gomp_map_kind): Add map kinds with 'present' modifiers.
	(GOMP_MAP_COPY_TO_P): Evaluate to true for map variants with 'present'
	modifiers.
	(GOMP_MAP_COPY_FROM_P): Likewise.
	(GOMP_MAP_ALWAYS_TO_P): Evaluate to true for map variants with
	'always, present' modifiers.
	(GOMP_MAP_ALWAYS_FROM_P): Likewise.
	(GOMP_MAP_ALWAYS): Redefine.
	(GOMP_MAP_FORCE_P): New.
	(GOMP_MAP_PRESENT_P): New.

	libgomp/
	* target.c (gomp_to_device_kind_p): Add map kinds with 'present'
	modifier.
	(gomp_map_vars_existing): Use new GOMP_MAP_FORCE_P macro.
	(gomp_map_vars_internal): Emit runtime error if memory region not
	present.
	(gomp_update): Likewise.
	(gomp_target_rev): Likewise.
	* testsuite/libgomp.c-c++-common/target-present-1.c: New.
	* testsuite/libgomp.c-c++-common/target-present-2.c: New.
	* testsuite/libgomp.c-c++-common/target-present-3.c: New.
	* testsuite/libgomp.fortran/target-present-1.f90: New.
	* testsuite/libgomp.fortran/target-present-2.f90: New.
	* testsuite/libgomp.fortran/target-present-3.f90: New.

Add 'present' map types to gfc_omp_deep_map_kind_p
2023-05-18 16:11:54 +01:00
Kwok Cheung Yeung a56cb34f65 libgomp, nvptx: Update bundled CUDA header file
This updates the bundled cuda.h header file to include some new API calls and
constants that are now used in the code.

This patch should be included when the "libgomp, nvptx: low-latency memory
allocator" or "openmp: Add support for 'target_device' context selector set"
patches are upstreamed.

2022-06-21  Kwok Cheung Yeung  <kcy@codesourcery.com>

	include/
	* cuda/cuda.h (CUdevice_attribute): Add definitions for
	CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MAJOR and
	CU_DEVICE_ATTRIBUTE_COMPUTE_CAPABILITY_MINOR.
	(CUmemAttach_flags): New.
	(CUpointer_attribute): New.
	(cuMemAllocManaged): New prototype.
	(cuPointerGetAttribute): New prototype.

	libgomp/
	* plugin/cuda-lib.def (cuMemAllocManaged): Add new call.
	(cuPointerGetAttribute): Likewise.
2023-05-16 19:14:26 +01:00
Julian Brown 44d3956596 Fortran "declare create"/allocate support for OpenACC
2018-10-04  Cesar Philippidis  <cesar@codesourcery.com>
            Julian Brown  <julian@codesourcery.com>

	gcc/
	* omp-low.cc (scan_sharing_clauses): Update handling of OpenACC declare
	create, declare copyin and declare deviceptr to have local lifetimes.
	(convert_to_firstprivate_int): Handle pointer types.
	(convert_from_firstprivate_int): Likewise.  Create local storage for
	the values being pointed to.  Add new orig_type argument.
	(lower_omp_target): Handle GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.
	Add orig_type argument to convert_from_firstprivate_int call.
	Allow pointer types with GOMP_MAP_FIRSTPRIVATE_INT.  Don't privatize
	firstprivate VLAs.
	* tree-pretty-print.cc (dump_omp_clause): Handle
	GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.

	gcc/fortran/
	* gfortran.h (enum gfc_omp_map_op): Add OMP_MAP_DECLARE_ALLOCATE,
	OMP_MAP_DECLARE_DEALLOCATE.
	(gfc_omp_clauses): Add update_allocatable.
	* trans-array.cc (gfc_array_allocate): Call
	gfc_trans_oacc_declare_allocate for decls that have oacc_declare_create
	attribute set.
	* trans-decl.cc (find_module_oacc_declare_clauses): Relax
	oacc_declare_create to OMP_MAP_ALLOC, and oacc_declare_copyin to
	OMP_MAP_TO, in order to	match OpenACC 2.5 semantics.
	* trans-openmp.cc (gfc_trans_omp_clauses): Use GOMP_MAP_ALWAYS_POINTER
	(for update directive) or GOMP_MAP_FIRSTPRIVATE_POINTER (otherwise) for
	allocatable scalar decls.  Handle OMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}
	clauses.
	(gfc_trans_oacc_executable_directive): Use GOMP_MAP_ALWAYS_POINTER
	for allocatable scalar data clauses inside acc update directives.
	(gfc_trans_oacc_declare_allocate): New function.
	* trans-stmt.cc (gfc_trans_allocate): Call
	gfc_trans_oacc_declare_allocate for decls with oacc_declare_create
	attribute set.
	(gfc_trans_deallocate): Likewise.
	* trans.h (gfc_trans_oacc_declare_allocate): Declare.

	gcc/testsuite/
	* gfortran.dg/goacc/declare-allocatable-1.f90: New test.

	include/
	* gomp-constants.h (enum gomp_map_kind): Define
	GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE} and GOMP_MAP_FLAG_SPECIAL_4.

	libgomp/
	* oacc-mem.c (gomp_acc_declare_allocate): New function.
	* oacc-parallel.c (GOACC_enter_exit_data): Handle
	GOMP_MAP_DECLARE_{ALLOCATE,DEALLOCATE}.
	* testsuite/libgomp.oacc-fortran/allocatable-scalar.f90: New test.
	* testsuite/libgomp.oacc-fortran/declare-allocatable-2.f90: New test.
	* testsuite/libgomp.oacc-fortran/declare-allocatable-3.f90: New test.
	* testsuite/libgomp.oacc-fortran/declare-allocatable-4.f90: New test.

2020-02-19  Julian Brown  <julian@codesourcery.com>

	gcc/fortran/
	* trans-openmp.cc (gfc_omp_check_optional_argument): Handle non-decl
	case.

	gcc/
	* gimplify.cc (gimplify_scan_omp_clauses): Handle
	GOMP_MAP_DECLARE_ALLOCATE and GOMP_MAP_DECLARE_DEALLOCATE.

	libgomp/
	* libgomp.h (gomp_acc_declare_allocate): Remove prototype.
	* oacc-mem.c (gomp_acc_declare_allocate): Make static.  Add POINTER
	argument. Use acc_delete instead of acc_free.  Handle scalar
	mappings.
	(find_group_last): Handle GOMP_MAP_DECLARE_ALLOCATE and
	GOMP_MAP_DECLARE_DEALLOCATE groupings.
	(goacc_enter_data_internal): Fix kind check for
	GOMP_MAP_DECLARE_ALLOCATE. Pass new pointer argument to
	gomp_acc_declare_allocate.
	(goacc_exit_data_internal): Unlock device mutex around
	gomp_acc_declare_allocate call. Pass new pointer argument. Handle
	group pointer mapping for deallocate.

2021-04-07  Kwok Cheung Yeung  <kcy@codesourcery.com>

	libgomp/
	* oacc-mem.c (goacc_enter_data_internal): Unlock mutex before calling
	gomp_acc_declare_allocate and relock it afterwards.
2023-05-12 19:13:45 +01:00
Chung-Lin Tang c329847278 Merge non-contiguous array support patches.
This version is based from v4, posted upstream here:
https://gcc.gnu.org/pipermail/gcc-patches/2020-April/543437.html

2020-04-19  Chung-Lin Tang  <cltang@codesourcery.com>

	PR other/76739

	gcc/c/
	* c-typeck.cc (handle_omp_array_sections_1): Add 'bool &non_contiguous'
	parameter, adjust recursive call site, add cases for allowing
	pointer based multi-dimensional arrays for OpenACC.
	(handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
	handle non-contiguous case to create dynamic array map.

	gcc/cp/
	* semantics.cc (handle_omp_array_sections_1): Add 'bool &non_contiguous'
	parameter, adjust recursive call site, add cases for allowing
	pointer based multi-dimensional arrays for OpenACC.
	(handle_omp_array_sections): Adjust handle_omp_array_sections_1 call,
	handle non-contiguous case to create dynamic array map.

	gcc/fortran/
	* f95-lang.cc (DEF_FUNCTION_TYPE_VAR_5): New symbol.
	* types.def (BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_VAR): New type.

	gcc/
	* builtin-types.def (BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_VAR): New type.
	* omp-builtins.def (BUILT_IN_GOACC_DATA_START): Adjust function type
	to new BT_FN_VOID_INT_SIZE_PTR_PTR_PTR_VAR.
	* gimplify.cc (gimplify_scan_omp_clauses): Skip gimplification of
	OMP_CLAUSE_SIZE of non-contiguous array maps (which is a TREE_LIST).
	* omp-expand.cc (expand_omp_target): Add non-contiguous array descriptor
	pointers to variadic arguments.
	* omp-low.cc (append_field_to_record_type): New function.
	(create_noncontig_array_descr_type): Likewise.
	(create_noncontig_array_descr_init_code): Likewise.
	(scan_sharing_clauses): For non-contiguous array map kinds, check for
	supported dimension structure, and install non-contiguous array
	variable into current omp_context.
	(reorder_noncontig_array_clauses): New function.
	(scan_omp_target): Call reorder_noncontig_array_clauses to place
	non-contiguous array map clauses at beginning of clause sequence.
	(lower_omp_target): Add handling for non-contiguous array map kinds,
	add all created non-contiguous array descriptors to
	gimple_omp_target_data_arg.

	gcc/testsuite/
	* c-c++-common/goacc/noncontig_array-1.c: New test.

	libgomp/
	* libgomp_g.h (GOACC_data_start): Add variadic '...' to declaration.
	* libgomp.h (gomp_map_vars_openacc): New function declaration.
	* oacc-int.h (struct goacc_ncarray_dim): New struct declaration.
	(struct goacc_ncarray_descr_type): Likewise.
	(struct goacc_ncarray): Likewise.
	(struct goacc_ncarray_info): Likewise.
	(goacc_noncontig_array_create_ptrblock): New function declaration.
	* oacc-parallel.c (goacc_noncontig_array_count_rows): New function.
	(goacc_noncontig_array_compute_sizes): Likewise.
	(goacc_noncontig_array_fill_rows_1): Likewise.
	(goacc_noncontig_array_fill_rows): Likewise.
	(goacc_process_noncontiguous_arrays): Likewise.
	(goacc_noncontig_array_create_ptrblock): Likewise.
	(GOACC_parallel_keyed): Use goacc_process_noncontiguous_arrays to
	handle non-contiguous array descriptors at end of varargs, adjust
	to use gomp_map_vars_openacc.
	(GOACC_data_start): Likewise. Adjust function type to accept varargs.
	* target.c (gomp_map_vars_internal): Add struct goacc_ncarray_info *
	nca_info parameter, add handling code for non-contiguous arrays.
	(gomp_map_vars_openacc): Add new function for specialization of
	gomp_map_vars_internal for OpenACC structured region usage.
	* testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/noncontig_array-3.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/noncontig_array-4.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/noncontig_array-utils.h: Support
	header for new tests.

	include/
	* gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_3): Define.
	(enum gomp_map_kind): Add GOMP_MAP_NONCONTIG_ARRAY,
	GOMP_MAP_NONCONTIG_ARRAY_TO, GOMP_MAP_NONCONTIG_ARRAY_FROM,
	GOMP_MAP_NONCONTIG_ARRAY_TOFROM, GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO,
	GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM, GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM,
	GOMP_MAP_NONCONTIG_ARRAY_ALLOC, GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC,
	GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT.
	(GOMP_MAP_NONCONTIG_ARRAY_P): Define.
2023-05-12 19:13:43 +01:00
Jakub Jelinek cc035c5d86 Update ChangeLog and version files for release 2023-04-26 07:10:03 +00:00
GCC Administrator bf0e0fc0ea Daily bump. 2023-02-28 00:18:40 +00:00
Max Filippov 999b7aab21 gcc: xtensa: add XCHAL_HAVE_{CLAMPS,DEPBITS,EXCLUSIVE,XEA3} to dynconfig
gcc/
	* config/xtensa/xtensa-dynconfig.cc (xtensa_get_config_v2)
	(xtensa_get_config_v3): New functions.

include/
	* xtensa-dynconfig.h (xtensa_config_v3): New struct.
	(xtensa_get_config_v3): New declaration.
	(XCHAL_HAVE_CLAMPS, XCHAL_HAVE_DEPBITS, XCHAL_HAVE_EXCLUSIVE)
	(XCHAL_HAVE_XEA3, XTENSA_CONFIG_V3_ENTRY_LIST): New definitions.
	(XTENSA_CONFIG_INSTANCE_LIST): Add xtensa_config_v3 instance.
	(XTENSA_CONFIG_ENTRY_LIST): Add XTENSA_CONFIG_V3_ENTRY_LIST.
2023-02-27 04:03:33 -08:00
GCC Administrator a37a0cb303 Daily bump. 2023-02-03 00:16:44 +00:00
Andrew Stubbs f6fff8a6fc amdgcn, libgomp: Manually allocated stacks
Switch from using stacks in the "private segment" to using a memory block
allocated on the host side.  The primary reason is to permit the reverse
offload implementation to access values located on the device stack, but
there may also be performance benefits, especially with repeated kernel
invocations.

This implementation unifies the stacks with the "team arena" optimization
feature, and now allows both to have run-time configurable sizes.

A new ABI is needed, so all libraries must be rebuilt, and newlib must be
version 4.3.0.20230120 or newer.

gcc/ChangeLog:

	* config/gcn/gcn-run.cc: Include libgomp-gcn.h.
	(struct kernargs): Replace the common content with kernargs_abi.
	(struct heap): Delete.
	(main): Read GCN_STACK_SIZE envvar.
	Allocate space for the device stacks.
	Write the new kernargs fields.
	* config/gcn/gcn.cc (gcn_option_override): Remove stack_size_opt.
	(default_requested_args): Remove PRIVATE_SEGMENT_BUFFER_ARG and
	PRIVATE_SEGMENT_WAVE_OFFSET_ARG.
	(gcn_addr_space_convert): Mask the QUEUE_PTR_ARG content.
	(gcn_expand_prologue): Move the TARGET_PACKED_WORK_ITEMS to the top.
	Set up the stacks from the values in the kernargs, not private.
	(gcn_expand_builtin_1): Match the stack configuration in the prologue.
	(gcn_hsa_declare_function_name): Turn off the private segment.
	(gcn_conditional_register_usage): Ensure QUEUE_PTR is fixed.
	* config/gcn/gcn.h (FIXED_REGISTERS): Fix the QUEUE_PTR register.
	* config/gcn/gcn.opt (mstack-size): Change the description.

include/ChangeLog:

	* gomp-constants.h (GOMP_VERSION_GCN): Bump.

libgomp/ChangeLog:

	* config/gcn/libgomp-gcn.h (DEFAULT_GCN_STACK_SIZE): New define.
	(DEFAULT_TEAM_ARENA_SIZE): New define.
	(struct heap): Move to this file.
	(struct kernargs_abi): Likewise.
	* config/gcn/team.c (gomp_gcn_enter_kernel): Use team arena size from
	the kernargs.
	* libgomp.h: Include libgomp-gcn.h.
	(TEAM_ARENA_SIZE): Remove.
	(team_malloc): Update the error message.
	* plugin/plugin-gcn.c (struct kernargs): Move common content to
	struct kernargs_abi.
	(struct agent_info): Rename team arenas to ephemeral memories.
	(struct team_arena_list): Rename ....
	(struct ephemeral_memories_list): to this.
	(struct heap): Delete.
	(team_arena_size): New variable.
	(stack_size): New variable.
	(print_kernel_dispatch): Update debug messages.
	(init_environment_variables): Read GCN_TEAM_ARENA_SIZE.
	Read GCN_STACK_SIZE.
	(get_team_arena): Rename ...
	(configure_ephemeral_memories): ... to this, and set up stacks.
	(release_team_arena): Rename ...
	(release_ephemeral_memories): ... to this.
	(destroy_team_arenas): Rename ...
	(destroy_ephemeral_memories): ... to this.
	(create_kernel_dispatch): Add num_threads parameter.
	Adjust for kernargs_abi refactor and ephemeral memories.
	(release_kernel_dispatch): Adjust for ephemeral memories.
	(run_kernel): Pass thread-count to create_kernel_dispatch.
	(GOMP_OFFLOAD_init_device): Adjust for ephemeral memories.
	(GOMP_OFFLOAD_fini_device): Adjust for ephemeral memories.

gcc/testsuite/ChangeLog:

	* gcc.c-torture/execute/pr47237.c: Xfail on amdgcn.
	* gcc.dg/builtin-apply3.c: Xfail for amdgcn.
	* gcc.dg/builtin-apply4.c: Xfail for amdgcn.
	* gcc.dg/torture/stackalign/builtin-apply-3.c: Xfail for amdgcn.
	* gcc.dg/torture/stackalign/builtin-apply-4.c: Xfail for amdgcn.
2023-02-02 11:47:03 +00:00
Jakub Jelinek 83ffe9cde7 Update copyright years. 2023-01-16 11:52:17 +01:00
GCC Administrator de282a2012 Daily bump. 2022-12-22 00:17:29 +00:00
Jason Merrill 302485a70a c++: source position of lambda captures [PR84471]
If the DECL_VALUE_EXPR of a VAR_DECL has EXPR_LOCATION set, then any use of
that variable looks like it has that location, which leads to the debugger
jumping back and forth for both lambdas and structured bindings.

Rather than fix all the uses, it seems simplest to remove any EXPR_LOCATION
when setting DECL_VALUE_EXPR.  So the cp/ hunks aren't necessary, but they
avoid the need to unshare to remove the location.

	PR c++/84471
	PR c++/107504

gcc/cp/ChangeLog:

	* coroutines.cc (transform_local_var_uses): Don't
	specify a location for DECL_VALUE_EXPR.
	* decl.cc (cp_finish_decomp): Likewise.

gcc/ChangeLog:

	* fold-const.cc (protected_set_expr_location_unshare): Not static.
	* tree.h: Declare it.
	* tree.cc (decl_value_expr_insert): Use it.

include/ChangeLog:

	* ansidecl.h (ATTRIBUTE_WARN_UNUSED_RESULT): Add __.

gcc/testsuite/ChangeLog:

	* g++.dg/tree-ssa/value-expr1.C: New test.
	* g++.dg/tree-ssa/value-expr2.C: New test.
	* g++.dg/analyzer/pr93212.C: Move warning.
2022-12-20 21:01:44 -05:00
GCC Administrator 26f4aefaeb Daily bump. 2022-12-15 00:17:29 +00:00
David Faust 70b303049e btf: correct generation for extern funcs [PR106773]
The eBPF loader expects to find entries for functions declared as extern
in the corresponding BTF_KIND_DATASEC record, but we were not generating
these entries.

This patch adds support for the 'extern' linkage of function types in
BTF, and creates entries for for them BTF_KIND_DATASEC records as needed.

	PR target/106773

gcc/

	* btfout.cc (get_section_name): New function.
	(btf_collect_datasec): Use it here. Process functions, marking them
	'extern' and generating DATASEC entries for them as appropriate. Move
	creation of BTF_KIND_FUNC records to here...
	(btf_dtd_emit_preprocess_cb): ... from here.

gcc/testsuite/

	* gcc.dg/debug/btf/btf-datasec-2.c: New test.
	* gcc.dg/debug/btf/btf-function-6.c: New test.

include/

	* btf.h (enum btf_func_linkage): New.
	(struct btf_var_secinfo): Update comments with notes about extern
	functions.
2022-12-14 10:20:53 -08:00
David Faust 2bce22e88e btf: add 'extern' linkage for variables [PR106773]
Add support for the 'extern' linkage value for BTF_KIND_VAR records,
which is used for variables declared as extern in the source file.

This also fixes a bug with BTF generation for extern variables which
have both a non-defining declaration and a defining declaration in the
same CU.

	PR target/106773

gcc/

	* btfout.cc (btf_collect_datasec): Mark extern variables as such.
	(btf_dvd_emit_preprocess_cb): Skip non-defining extern variable decl
	if there is a defining decl for the same variable.
	(btf_asm_varent): Accomodate 'extern' linkage.

gcc/testsuite/

	* gcc.dg/debug/btf/btf-variables-4.c: New test.
	* gcc.dg/debug/btf/btf-variables-5.c: New test.

include/

	* btf.h (enum btf_var_linkage): New.
	(struct btf_var): Update comment to note 'extern' linkage.
2022-12-14 10:20:52 -08:00
GCC Administrator 4bc2d9f6cb Daily bump. 2022-12-08 00:17:45 +00:00
Max Filippov ecb575d09c gcc: xtensa: allow dynamic configuration
Import include/xtensa-dynconfig.h that defines XCHAL_* macros as fields
of a structure returned from the xtensa_get_config_v<x> function call.
Define that structure and fill it with default parameter values
specified in the include/xtensa-config.h.
Define reusable function xtensa_load_config that tries to load
configuration and return an address of an exported object from it.
Define the function xtensa_get_config_v1 that uses xtensa_load_config
to get structure xtensa_config_v1, either dynamically configured or the
default.

Provide essential XCHAL_* configuration parameters as __XCHAL_* built-in
macros. This way it will be possible to use them in libgcc and libc
without need to patch libgcc or libc source for the specific xtensa core
configuration.

gcc/
	* config.gcc (xtensa*-*-*): Add xtensa-dynconfig.o to extra_objs.
	* config/xtensa/t-xtensa (TM_H): Add xtensa-dynconfig.h.
	(xtensa-dynconfig.o): New rule.
	* config/xtensa/xtensa-dynconfig.c: New file.
	* config/xtensa/xtensa-protos.h (xtensa_get_config_strings): New
	declaration.
	* config/xtensa/xtensa.h (xtensa-config.h): Replace #include
	with xtensa-dynconfig.h
	(XCHAL_HAVE_MUL32_HIGH, XCHAL_HAVE_RELEASE_SYNC)
	(XCHAL_HAVE_S32C1I, XCHAL_HAVE_THREADPTR)
	(XCHAL_HAVE_FP_POSTINC): Drop definitions.
	(TARGET_DIV32): Replace with __XCHAL_HAVE_DIV32.
	(TARGET_CPU_CPP_BUILTINS): Add new 'builtin' variable and loop
	through string array returned by the xtensa_get_config_strings
	function call.

include/
	* xtensa-dynconfig.h: New file.
2022-12-07 10:00:48 -08:00
GCC Administrator cdc34229c1 Daily bump. 2022-11-16 00:17:09 +00:00
Nathan Sidwell 46c3d9c8e8 demangler: Templated lambda demangling
Templated lambdas have a template-head, which is part of their
signature.  GCC ABI 18 mangles that into the lambda name.  This adds
support to the demangler.  We have to introduce artificial template
parameter names, as we need to refer to them from later components of
the lambda signature. We use $T:n, $N:n and $TT:n for type, non-type
and template parameters.  Non-type parameter names are not shown in
the strictly correct location -- for instance 'int (&NT) ()' would be
shown as 'int (&) $N:n'.  That's unfortunate, but an orthogonal issue.
The 'is_lambda_arg' field is now repurposed as indicating the number
of explicit template parameters (1-based).

	include/
	* demangle.h (enum demangle_component_type): Add
	DEMANGLE_COMPONENT_TEMPLATE_HEAD,
	DEMANGLE_COMPONENT_TEMPLATE_TYPE_PARM,
	DEMANGLE_COMPONENT_TEMPLATE_NON_TYPE_PARM,
	DEMANGLE_COMPONENT_TEMPLATE_TEMPLATE_PARM,
	DEMANGLE_COMPONENT_TEMPLATE_PACK_PARM.
	libiberty/
	* cp-demangle.c (struct d_print_info): Rename is_lambda_arg to
	lambda_tpl_parms.  Augment semantics.
	(d_make_comp): Add checks for new components.
	(d_template_parm, d_template_head): New.
	(d_lambda): Add templated lambda support.
	(d_print_init): Adjust.
	(d_print_lambda_parm_name): New.
	(d_print_comp_inner): Support templated lambdas,
	* testsuite/demangle-expected: Add testcases.
2022-11-15 13:34:56 -05:00
Jakub Jelinek 89d0a14a1f Manually add ChangeLog entries from r13-3652-ge4cba49413ca429dc82f6aa2e88129ecb3fdd943
This commit caused failure of update_version_git due to the removal of
liboffloadmic with ChangeLog in it, so I had to blacklist that commit
and here I'm adding ChangeLog entries manually.
2022-11-06 12:12:47 +01:00
Thomas Schwinge e4cba49413 Remove support for Intel MIC offloading
... after its deprecation in GCC 12.

	* Makefile.def: Remove module 'liboffloadmic'.
	* Makefile.in: Regenerate.
	* configure.ac: Remove 'liboffloadmic' handling.
	* configure: Regenerate.
	contrib/
	* gcc-changelog/git_commit.py (default_changelog_locations):
	Remove 'liboffloadmic'.
	* gcc_update (files_and_dependencies): Remove 'liboffloadmic'
	files.
	* update-copyright.py (GCCCmdLine): Remove 'liboffloadmic'
	comment.
	gcc/
	* config.gcc [target *-intelmic-* | *-intelmicemul-*]: Remove.
	* config/i386/i386-options.cc (ix86_omp_device_kind_arch_isa)
	[ACCEL_COMPILER]: Remove.
	* config/i386/intelmic-mkoffload.cc: Remove.
	* config/i386/intelmic-offload.h: Likewise.
	* config/i386/t-intelmic: Likewise.
	* config/i386/t-omp-device: Likewise.
	* configure.ac [target *-intelmic-* | *-intelmicemul-*]: Remove.
	* configure: Regenerate.
	* doc/install.texi (--enable-offload-targets=[...]): Update.
	* doc/sourcebuild.texi: Remove 'liboffloadmic' documentation.
	include/
	* gomp-constants.h (GOMP_DEVICE_INTEL_MIC): Comment out.
	(GOMP_VERSION_INTEL_MIC): Remove.
	libgomp/
	* libgomp-plugin.h (OFFLOAD_TARGET_TYPE_INTEL_MIC): Remove.
	* libgomp.texi (OpenMP Context Selectors): Remove Intel MIC
	documentation.
	* plugin/configfrag.ac <enable_offload_targets>
	[*-intelmic-* | *-intelmicemul-*]: Remove.
	* configure: Regenerate.
	* testsuite/lib/libgomp.exp (libgomp_init): Remove 'liboffloadmic'
	handling.
	(offload_target_to_openacc_device_type)
	[$offload_target = *-intelmic*]: Remove.
	(check_effective_target_offload_device_intel_mic)
	(check_effective_target_offload_device_any_intel_mic): Remove.
	* testsuite/libgomp.c-c++-common/on_device_arch.h
	(device_arch_intel_mic, on_device_arch_intel_mic, any_device_arch)
	(any_device_arch_intel_mic): Remove.
	* testsuite/libgomp.c-c++-common/target-45.c: Remove
	'offload_device_any_intel_mic' XFAIL.
	* testsuite/libgomp.fortran/target10.f90: Likewise.
	liboffloadmic/
	* ChangeLog: Remove.
	* Makefile.am: Likewise.
	* Makefile.in: Likewise.
	* aclocal.m4: Likewise.
	* configure: Likewise.
	* configure.ac: Likewise.
	* configure.tgt: Likewise.
	* doc/doxygen/config: Likewise.
	* doc/doxygen/header.tex: Likewise.
	* include/coi/common/COIEngine_common.h: Likewise.
	* include/coi/common/COIEvent_common.h: Likewise.
	* include/coi/common/COIMacros_common.h: Likewise.
	* include/coi/common/COIPerf_common.h: Likewise.
	* include/coi/common/COIResult_common.h: Likewise.
	* include/coi/common/COISysInfo_common.h: Likewise.
	* include/coi/common/COITypes_common.h: Likewise.
	* include/coi/sink/COIBuffer_sink.h: Likewise.
	* include/coi/sink/COIPipeline_sink.h: Likewise.
	* include/coi/sink/COIProcess_sink.h: Likewise.
	* include/coi/source/COIBuffer_source.h: Likewise.
	* include/coi/source/COIEngine_source.h: Likewise.
	* include/coi/source/COIEvent_source.h: Likewise.
	* include/coi/source/COIPipeline_source.h: Likewise.
	* include/coi/source/COIProcess_source.h: Likewise.
	* liboffloadmic_host.spec.in: Likewise.
	* liboffloadmic_target.spec.in: Likewise.
	* plugin/Makefile.am: Likewise.
	* plugin/Makefile.in: Likewise.
	* plugin/aclocal.m4: Likewise.
	* plugin/configure: Likewise.
	* plugin/configure.ac: Likewise.
	* plugin/libgomp-plugin-intelmic.cpp: Likewise.
	* plugin/offload_target_main.cpp: Likewise.
	* runtime/cean_util.cpp: Likewise.
	* runtime/cean_util.h: Likewise.
	* runtime/coi/coi_client.cpp: Likewise.
	* runtime/coi/coi_client.h: Likewise.
	* runtime/coi/coi_server.cpp: Likewise.
	* runtime/coi/coi_server.h: Likewise.
	* runtime/compiler_if_host.cpp: Likewise.
	* runtime/compiler_if_host.h: Likewise.
	* runtime/compiler_if_target.cpp: Likewise.
	* runtime/compiler_if_target.h: Likewise.
	* runtime/dv_util.cpp: Likewise.
	* runtime/dv_util.h: Likewise.
	* runtime/emulator/coi_common.h: Likewise.
	* runtime/emulator/coi_device.cpp: Likewise.
	* runtime/emulator/coi_device.h: Likewise.
	* runtime/emulator/coi_host.cpp: Likewise.
	* runtime/emulator/coi_host.h: Likewise.
	* runtime/emulator/coi_version_asm.h: Likewise.
	* runtime/emulator/coi_version_linker_script.map: Likewise.
	* runtime/liboffload_error.c: Likewise.
	* runtime/liboffload_error_codes.h: Likewise.
	* runtime/liboffload_msg.c: Likewise.
	* runtime/liboffload_msg.h: Likewise.
	* runtime/mic_lib.f90: Likewise.
	* runtime/offload.h: Likewise.
	* runtime/offload_common.cpp: Likewise.
	* runtime/offload_common.h: Likewise.
	* runtime/offload_engine.cpp: Likewise.
	* runtime/offload_engine.h: Likewise.
	* runtime/offload_env.cpp: Likewise.
	* runtime/offload_env.h: Likewise.
	* runtime/offload_host.cpp: Likewise.
	* runtime/offload_host.h: Likewise.
	* runtime/offload_iterator.h: Likewise.
	* runtime/offload_omp_host.cpp: Likewise.
	* runtime/offload_omp_target.cpp: Likewise.
	* runtime/offload_orsl.cpp: Likewise.
	* runtime/offload_orsl.h: Likewise.
	* runtime/offload_table.cpp: Likewise.
	* runtime/offload_table.h: Likewise.
	* runtime/offload_target.cpp: Likewise.
	* runtime/offload_target.h: Likewise.
	* runtime/offload_target_main.cpp: Likewise.
	* runtime/offload_timer.h: Likewise.
	* runtime/offload_timer_host.cpp: Likewise.
	* runtime/offload_timer_target.cpp: Likewise.
	* runtime/offload_trace.cpp: Likewise.
	* runtime/offload_trace.h: Likewise.
	* runtime/offload_util.cpp: Likewise.
	* runtime/offload_util.h: Likewise.
	* runtime/ofldbegin.cpp: Likewise.
	* runtime/ofldend.cpp: Likewise.
	* runtime/orsl-lite/include/orsl-lite.h: Likewise.
	* runtime/orsl-lite/lib/orsl-lite.c: Likewise.
	* runtime/orsl-lite/version.txt: Likewise.
2022-11-04 10:51:01 +01:00
GCC Administrator 9a8b868d7a Daily bump. 2022-11-01 00:19:02 +00:00
Guillermo E. Martinez 8422861bdd btf: Add support to BTF_KIND_ENUM64 type
BTF supports 64-bits enumerators with following encoding:

  struct btf_type:
    name_off: 0 or offset to a valid C identifier
    info.kind_flag: 0 for unsigned, 1 for signed
    info.kind: BTF_KIND_ENUM64
    info.vlen: number of enum values
    size: 1/2/4/8

The btf_type is followed by info.vlen number of:

    struct btf_enum64
    {
      uint32_t name_off;   /* Offset in string section of enumerator name.  */
      uint32_t val_lo32;   /* lower 32-bit value for a 64-bit value Enumerator */
      uint32_t val_hi32;   /* high 32-bit value for a 64-bit value Enumerator */
    };

So, a new btf_enum64 structure was added to represent BTF_KIND_ENUM64
and a new field dtd_enum_unsigned in ctf_dtdef structure to distinguish
when CTF enum is a signed or unsigned type, later that information is
used to encode the BTF enum type.

gcc/ChangeLog:

	* btfout.cc (btf_calc_num_vbytes): Compute enumeration size depending of
	enumerator type btf_enum{,64}.
	(btf_asm_type): Update btf_kflag according to enumeration type sign
	using dtd_enum_unsigned field for both:  BTF_KIND_ENUM{,64}.
	(btf_asm_enum_const): New argument to represent the size of
	the BTF enum type, writing the enumerator constant value for
	32 bits, if it's 64 bits then explicitly writes lower 32-bits
	value and higher 32-bits value.
	(output_asm_btf_enum_list): Add enumeration size argument.
	* ctfc.cc (ctf_add_enum): New argument to represent CTF enum
	basic information.
	(ctf_add_generic): Use of ei_{name. size, unsigned} to build the
	dtd structure containing enumeration information.
	(ctf_add_enumerator): Update comment mention support for BTF
	enumeration in 64-bits.
	* dwarf2ctf.cc (gen_ctf_enumeration_type): Extract signedness
	for enumeration type and use it in ctf_add_enum.
	* ctfc.h (ctf_dmdef): Update dmd_value to HOST_WIDE_INT to allow
	use 32/64 bits enumerators.
	information.
	(ctf_dtdef): New field to describe enum signedness.

include/
	* btf.h (btf_enum64): Add new definition and new symbolic
	constant to BTF_KIND_ENUM64 and BTF_KF_ENUM_{UN,}SIGNED.

gcc/testsuite/ChangeLog:

	* gcc.dg/debug/btf/btf-enum-1.c: Update testcase, with correct
	info.kflags encoding.
	* gcc.dg/debug/btf/btf-enum64-1.c: New testcase.
2022-10-31 09:34:22 -07:00
Lulu Cheng 27b9e1158b Libvtv: Add loongarch support.
The loongarch64 specification permits page sizes of 4KiB, 16KiB and 64KiB,
but only 16KiB pages are supported for now.

Co-Authored-By: qijingwen <qijingwen@loongson.cn>

include/ChangeLog:

	* vtv-change-permission.h (defined): Determines whether the macro
	__loongarch_lp64 is defined
	(VTV_PAGE_SIZE): Set VTV_PAGE_SIZE to 16KiB for loongarch64.

libvtv/ChangeLog:

	* configure.tgt: Add loongarch support.
2022-10-31 17:35:02 +08:00
GCC Administrator 4e939ae1cf Daily bump. 2022-10-25 00:17:33 +00:00
Tobias Burnus 131d18e928 libgomp/nvptx: Prepare for reverse-offload callback handling
This patch adds a stub 'gomp_target_rev' in the host's target.c, which will
later handle the reverse offload.
For nvptx, it adds support for forwarding the offload gomp_target_ext call
to the host by setting values in a struct on the device and querying it on
the host - invoking gomp_target_rev on the result.

include/ChangeLog:

	* cuda/cuda.h (enum CUdevice_attribute): Add
	CU_DEVICE_ATTRIBUTE_UNIFIED_ADDRESSING.
	(CU_MEMHOSTALLOC_DEVICEMAP): Define.
	(cuMemHostAlloc): Add prototype.

libgomp/ChangeLog:

	* config/nvptx/icv-device.c (GOMP_DEVICE_NUM_VAR): Remove
	'static' for this variable.
	* config/nvptx/libgomp-nvptx.h: New file.
	* config/nvptx/target.c: Include it.
	(GOMP_ADDITIONAL_ICVS): Declare extern var.
	(GOMP_REV_OFFLOAD_VAR): Declare var.
	(GOMP_target_ext): Handle reverse offload.
	* libgomp-plugin.h (GOMP_PLUGIN_target_rev): New prototype.
	* libgomp-plugin.c (GOMP_PLUGIN_target_rev): New, call ...
	* target.c (gomp_target_rev): ... this new stub function.
	* libgomp.h (gomp_target_rev): Declare.
	* libgomp.map (GOMP_PLUGIN_1.4): New; add GOMP_PLUGIN_target_rev.
	* plugin/cuda-lib.def (cuMemHostAlloc): Add.
	* plugin/plugin-nvptx.c: Include libgomp-nvptx.h.
	(struct ptx_device): Add rev_data member.
	(nvptx_open_device): Remove async_engines query, last used in
	r10-304-g1f4c5b9b; add unified-address assert check.
	(GOMP_OFFLOAD_get_num_devices): Claim unified address
	support.
	(GOMP_OFFLOAD_load_image): Free rev_fn_table if no
	offload functions exist. Make offload var available
	on host and device.
	(rev_off_dev_to_host_cpy, rev_off_host_to_dev_cpy): New.
	(GOMP_OFFLOAD_run): Handle reverse offload.
2022-10-24 17:04:08 +02:00
GCC Administrator 47a6ae5658 Daily bump. 2022-10-21 00:17:52 +00:00
Tobias Burnus 12d9f5afbd libgomp: Add offload_device_gcn check, add requires-4a.c test
Duplicate libgomp.c-c++-common/requires-4.c (as ...-4a.c) but
with using a heap-allocated instead of static memory for a variable.

This change and the added offload_device_gcn check prepare for
pseudo-USM, where the device hardware cannot access all host
memory but only managed and pinned memory; for those, requires-4.c
will fail and the new check permits to add
  target { ! { offload_device_nvptx || offload_device_gcn } }
to requires-4.c; however, it has not been added yet as pseuo-USM
support is not yet on mainline. (Review is pending for the USM
patches.)

include/ChangeLog:

	* gomp-constants.h (GOMP_DEVICE_HSA): Comment out unused define.

libgomp/ChangeLog:

	* testsuite/lib/libgomp.exp (check_effective_target_offload_device_gcn):
	New.
	* testsuite/libgomp.c-c++-common/on_device_arch.h (device_arch_gcn,
	on_device_arch_gcn): New.
	* testsuite/libgomp.c-c++-common/requires-4a.c: New test; copied from
	requires-4.c but using heap-allocated memory.
2022-10-20 12:58:52 +02:00
GCC Administrator 621a911d33 Daily bump. 2022-10-14 00:16:35 +00:00
Xi Ruoyao 6f653a2c85
LoongArch: implement count_{leading,trailing}_zeros
LoongArch always support clz and ctz instructions, so we can always use
__builtin_{clz,ctz} for count_{leading,trailing}_zeros.  This improves
the code of libgcc, and also benefits Glibc once we merge longlong.h
there.

Bootstrapped and regtested on loongarch64-linux-gnu.

include/ChangeLog:

	* longlong.h [__loongarch__] (count_leading_zeros): Define.
	[__loongarch__] (count_trailing_zeros): Likewise.
	[__loongarch__] (COUNT_LEADING_ZEROS_0): Likewise.
2022-10-13 18:05:22 +08:00
GCC Administrator 1f16a020ac Daily bump. 2022-09-28 00:17:27 +00:00
Jakub Jelinek b04208895f c++: Implement P1467R9 - Extended floating-point types and standard names compiler part except for bfloat16 [PR106652]
The following patch implements the compiler part of C++23
P1467R9 - Extended floating-point types and standard names compiler part
by introducing _Float{16,32,64,128} as keywords and builtin types
like they are implemented for C already since GCC 7, with DF{16,32,64,128}_
mangling.
It also introduces _Float{32,64,128}x for C++ with the
https://github.com/itanium-cxx-abi/cxx-abi/pull/147
proposed mangling of DF{32,64,128}x.
The patch doesn't add anything for bfloat16_t support, as right now
__bf16 type refuses all conversions and arithmetic operations.
The patch wants to keep backwards compatibility with how __float128 has
been handled in C++ before, both for mangling and behavior in binary
operations, overload resolution etc.  So, there are some backend changes
where for C __float128 and _Float128 are the same type (float128_type_node
and float128t_type_node are the same pointer), but for C++ they are distinct
types which mangle differently and _Float128 is treated as extended
floating-point type while __float128 is treated as non-standard floating
point type.  The various C++23 changes about how floating-point types
are changed are actually implemented as written in the spec only if at least
one of the types involved is _Float{16,32,64,128,32x,64x,128x} (_FloatNx are
also treated as extended floating-point types) and kept previous behavior
otherwise.  For float/double/long double the rules are actually written that
they behave the same as before.
There is some backwards incompatibility at least on x86 regarding _Float16,
because that type was already used by that name and with the DF16_ mangling
(but only since GCC 12 and I think it isn't that widely used in the wild
yet).  E.g. config/i386/avx512fp16intrin.h shows the issues, where
in C or in GCC 12 in C++ one could pass 0.0f to a builtin taking _Float16
argument, but with the changes that is not possible anymore, one needs
to either use 0.0f16 or (_Float16) 0.0f.
We have also a problem with glibc headers, where since glibc 2.27
math.h and complex.h aren't compilable with these changes.  One gets
errors like:
In file included from /usr/include/math.h:43,
                 from abc.c:1:
/usr/include/bits/floatn.h:86:9: error: multiple types in one declaration
   86 | typedef __float128 _Float128;
      |         ^~~~~~~~~~
/usr/include/bits/floatn.h:86:20: error: declaration does not declare anything [-fpermissive]
   86 | typedef __float128 _Float128;
      |                    ^~~~~~~~~
In file included from /usr/include/bits/floatn.h:119:
/usr/include/bits/floatn-common.h:214:9: error: multiple types in one declaration
  214 | typedef float _Float32;
      |         ^~~~~
/usr/include/bits/floatn-common.h:214:15: error: declaration does not declare anything [-fpermissive]
  214 | typedef float _Float32;
      |               ^~~~~~~~
/usr/include/bits/floatn-common.h:251:9: error: multiple types in one declaration
  251 | typedef double _Float64;
      |         ^~~~~~
/usr/include/bits/floatn-common.h:251:16: error: declaration does not declare anything [-fpermissive]
  251 | typedef double _Float64;
      |                ^~~~~~~~
This is from snippets like:
 /* The remaining of this file provides support for older compilers.  */
 # if __HAVE_FLOAT128

 /* The type _Float128 exists only since GCC 7.0.  */
 #  if !__GNUC_PREREQ (7, 0) || defined __cplusplus
 typedef __float128 _Float128;
 #  endif
where it hardcodes that C++ doesn't have _Float{16,32,64,128,32x,64x,128x} support nor
{f,F}{16,32,64,128}{,x} literal suffixes nor _Complex _Float{16,32,64,128,32x,64x,128x}.
The patch fixincludes this for now and hopefully if this is committed, then
glibc can change those.  The patch changes those
 #  if !__GNUC_PREREQ (7, 0) || defined __cplusplus
conditions to
 #  if !__GNUC_PREREQ (7, 0) || (defined __cplusplus && !__GNUC_PREREQ (13, 0))
Another thing is mangling, as said above, Itanium C++ ABI specifies
DF <number> _ as _Float{16,32,64,128} mangling, but GCC was implementing
a mangling incompatible with that starting with DF for fixed point types.
Fixed point was never supported in C++ though, I believe the reason why
the mangling has been added was that due to a bug it would leak into the
C++ FE through decltype (0.0r) etc.  But that has been shortly after the
mangling was added fixed (I think in the same GCC release cycle), so we
now reject 0.0r etc. in C++.  If we ever need the fixed point mangling,
I think it can be readded but better with a different prefix so that it
doesn't conflict with the published standard manglings.  So, this patch
also kills the fixed point mangling and implements the DF <number> _
demangling.
The patch predefines __STDCPP_FLOAT{16,32,64,128}_T__ macros when
those types are available, but only for C++23, while the underlying types
are available in C++98 and later including the {f,F}{16,32,64,128} literal
suffixes (but those with a pedwarn for C++20 and earlier).  My understanding
is that it needs to be predefined by the compiler, on the other side
predefining even for older modes when <stdfloat> is a new C++23 header
would be weird.  One can find out if _Float{16,32,64,128,32x,64x,128x} is
supported in C++ by
__GNUC__ >= 13 && defined(__FLT{16,32,64,128,32X,64X,128X}_MANT_DIG__)
(but that doesn't work well with older G++ 13 snapshots).

As for std::bfloat16_t, three targets (aarch64, arm and x86) apparently
"support" __bf16 type which has the bfloat16 format, but isn't really
usable, e.g. {aarch64,arm,ix86}_invalid_conversion disallow any conversions
from or to type with BFmode, {aarch64,arm,ix86}_invalid_unary_op disallows
any unary operations on those except for ADDR_EXPR and
{aarch64,arm,ix86}_invalid_binary_op disallows any binary operation on
those.  So, I think we satisfy:
"If the implementation supports an extended floating-point type with the
properties, as specified by ISO/IEC/IEEE 60559, of radix (b) of 2, storage
width in bits (k) of 16, precision in bits (p) of 8, maximum exponent (emax)
of 127, and exponent field width in bits (w) of 8, then the typedef-name
std::bfloat16_t is defined in the header <stdfloat> and names such a type,
the macro __STDCPP_BFLOAT16_T__ is defined, and the floating-point literal
suffixes bf16 and BF16 are supported."
because we don't really support those right now.

2022-09-27  Jakub Jelinek  <jakub@redhat.com>

	PR c++/106652
	PR c++/85518
gcc/
	* tree-core.h (enum tree_index): Add TI_FLOAT128T_TYPE
	enumerator.
	* tree.h (float128t_type_node): Define.
	* tree.cc (build_common_tree_nodes): Initialize float128t_type_node.
	* builtins.def (DEF_FLOATN_BUILTIN): Adjust comment now that
	_Float<N> is supported in C++ too.
	* config/i386/i386.cc (ix86_mangle_type): Only mangle as "g"
	float128t_type_node.
	* config/i386/i386-builtins.cc (ix86_init_builtin_types): Use
	float128t_type_node for __float128 instead of float128_type_node
	and create it if NULL.
	* config/i386/avx512fp16intrin.h (_mm_setzero_ph, _mm256_setzero_ph,
	_mm512_setzero_ph, _mm_set_sh, _mm_load_sh): Use 0.0f16 instead of
	0.0f.
	* config/ia64/ia64.cc (ia64_init_builtins): Use
	float128t_type_node for __float128 instead of float128_type_node
	and create it if NULL.
	* config/rs6000/rs6000-c.cc (is_float128_p): Also return true
	for float128t_type_node if non-NULL.
	* config/rs6000/rs6000.cc (rs6000_mangle_type): Don't mangle
	float128_type_node as "u9__ieee128".
	* config/rs6000/rs6000-builtin.cc (rs6000_init_builtins): Use
	float128t_type_node for __float128 instead of float128_type_node
	and create it if NULL.
gcc/c-family/
	* c-common.cc (c_common_reswords): Change _Float{16,32,64,128} and
	_Float{32,64,128}x flags from D_CONLY to 0.
	(shorten_binary_op): Punt if common_type returns error_mark_node.
	(shorten_compare): Likewise.
	(c_common_nodes_and_builtins): For C++ record _Float{16,32,64,128}
	and _Float{32,64,128}x builtin types if available.  For C++
	clear float128t_type_node.
	* c-cppbuiltin.cc (c_cpp_builtins): Predefine
	__STDCPP_FLOAT{16,32,64,128}_T__ for C++23 if supported.
	* c-lex.cc (interpret_float): For q/Q suffixes prefer
	float128t_type_node over float128_type_node.  Allow
	{f,F}{16,32,64,128} suffixes for C++ if supported with pedwarn
	for C++20 and older.  Allow {f,F}{32,64,128}x suffixes for C++
	with pedwarn.  Don't call excess_precision_type for C++.
gcc/cp/
	* cp-tree.h (cp_compare_floating_point_conversion_ranks): Implement
	P1467R9 - Extended floating-point types and standard names except
	for std::bfloat16_t for now.  Declare.
	(extended_float_type_p): New inline function.
	* mangle.cc (write_builtin_type): Mangle float{16,32,64,128}_type_node
	as DF{16,32,64,128}_.  Mangle float{32,64,128}x_type_node as
	DF{32,64,128}x.  Remove FIXED_POINT_TYPE mangling that conflicts
	with that.
	* typeck2.cc (check_narrowing): If one of ftype or type is extended
	floating-point type, compare floating-point conversion ranks.
	* parser.cc (cp_keyword_starts_decl_specifier_p): Handle
	CASE_RID_FLOATN_NX.
	(cp_parser_simple_type_specifier): Likewise and diagnose missing
	_Float<N> or _Float<N>x support if not supported by target.
	* typeck.cc (cp_compare_floating_point_conversion_ranks): New function.
	(cp_common_type): If both types are REAL_TYPE and one or both are
	extended floating-point types, select common type based on comparison
	of floating-point conversion ranks and subranks.
	(cp_build_binary_op): Diagnose operation with floating point arguments
	with unordered conversion ranks.
	* call.cc (standard_conversion): For floating-point conversion, if
	either from or to are extended floating-point types, set conv->bad_p
	for implicit conversion from larger to smaller conversion rank or
	with unordered conversion ranks.
	(convert_like_internal): Emit a pedwarn on such conversions.
	(build_conditional_expr): Diagnose operation with floating point
	arguments with unordered conversion ranks.
	(convert_arg_to_ellipsis): Don't promote extended floating-point types
	narrower than double to double.
	(compare_ics): Implement P1467R9 [over.ics.rank]/4 changes.
gcc/testsuite/
	* g++.dg/cpp23/ext-floating1.C: New test.
	* g++.dg/cpp23/ext-floating2.C: New test.
	* g++.dg/cpp23/ext-floating3.C: New test.
	* g++.dg/cpp23/ext-floating4.C: New test.
	* g++.dg/cpp23/ext-floating5.C: New test.
	* g++.dg/cpp23/ext-floating6.C: New test.
	* g++.dg/cpp23/ext-floating7.C: New test.
	* g++.dg/cpp23/ext-floating8.C: New test.
	* g++.dg/cpp23/ext-floating9.C: New test.
	* g++.dg/cpp23/ext-floating10.C: New test.
	* g++.dg/cpp23/ext-floating.h: New file.
	* g++.target/i386/float16-1.C: Adjust expected diagnostics.
libcpp/
	* expr.cc (interpret_float_suffix): Allow {f,F}{16,32,64,128} and
	{f,F}{32,64,128}x suffixes for C++.
include/
	* demangle.h (enum demangle_component_type): Add
	DEMANGLE_COMPONENT_EXTENDED_BUILTIN_TYPE.
	(struct demangle_component): Add u.s_extended_builtin member.
libiberty/
	* cp-demangle.c (d_dump): Handle
	DEMANGLE_COMPONENT_EXTENDED_BUILTIN_TYPE.  Don't handle
	DEMANGLE_COMPONENT_FIXED_TYPE.
	(d_make_extended_builtin_type): New function.
	(cplus_demangle_builtin_types): Add _Float entry.
	(cplus_demangle_type): For DF demangle it as _Float<N> or
	_Float<N>x rather than fixed point which conflicts with it.
	(d_count_templates_scopes): Handle
	DEMANGLE_COMPONENT_EXTENDED_BUILTIN_TYPE.  Just break; for
	DEMANGLE_COMPONENT_FIXED_TYPE.
	(d_find_pack): Handle DEMANGLE_COMPONENT_EXTENDED_BUILTIN_TYPE.
	Don't handle DEMANGLE_COMPONENT_FIXED_TYPE.
	(d_print_comp_inner): Likewise.
	* cp-demangle.h (D_BUILTIN_TYPE_COUNT): Bump.
	* testsuite/demangle-expected: Replace _Z3xxxDFyuVb test
	with _Z3xxxDF16_DF32_DF64_DF128_CDF16_Vb.  Add
	_Z3xxxDF32xDF64xDF128xCDF32xVb test.
fixincludes/
	* inclhack.def (glibc_cxx_floatn_1, glibc_cxx_floatn_2,
	glibc_cxx_floatn_3): New fixes.
	* tests/base/bits/floatn.h: New file.
	* fixincl.x: Regenerated.
2022-09-27 08:18:00 +02:00
Meghan Denny 8be65640e1 Updated constants from <https://dwarfstd.org/Languages.php>
include
	* dwarf2.h: Update with additional languages from dwarf
	standard.
2022-09-26 23:51:52 -04:00