Commit Graph

205 Commits

Author SHA1 Message Date
Martin Jambor c56684fd61 Removal of HSA offloading from gcc and libgomp
This patch removes the generation of HSAIL from the compiler, the HSA
offloading plugin from libgomp and the associated testsuite tests and
infrastructure bits from the respective testsuites.

Apart from removal of the obvious files, I removed bits that I found
by searching for HSA related terms and by re-tracing my steps and
looking at the patches that introduced HSA in the first place.  I did
not remove everything these patches brought in, for example:

  - the mechanism to pass offload-target specific info from the application to
    the offloading plugin - but the same mechanism is also used to
    communicate number of teams and the thread limit to all offload targets.

  - run_func hook in gomp_device_descr stays too, although now it is
    not used.  If some future offload target would like the ability to
    refuse to offload some functions, it can use it.  It is easy to
    remove as a follow-up if it is considered clutter, though.

  - configure options --with-hsa-runtime=PATH, -with-hsa-runtime-include=PATH
    and --with-hsa-runtime-lib=PATH rmeain because GCN uses them too.

  - Surprisingly, GOMP_TARGET_ARG_HSA_KERNEL_ATTRIBUTES (a constant
    from gomp-constants.h) appears in the source of the amdgcn libgomp
    plugin, although I tend to think that code path is not ever used
    and this patch certainly removes it from the compiler.
    Nevertheless, it seems it has potential value beyond HSAIL and so
    I've kept it, it can of course always be easily removed in the
    future of GCN folk abandon it too.

  - I assume constants OFFLOAD_TARGET_TYPE_HSA and GOMP_DEVICE_HSA
    need to stay indefinitely too just so that no future offload
    target picks that number.

  - I have kept dg-require-effective-target
    offload_device_nonshared_as requirement of thests which have it.

It is quite probable I missed some small HSA artifacts but those
should be easy to remove later as we find them.

include/ChangeLog:

2020-07-24  Martin Jambor  <mjambor@suse.cz>

	* gomp-constants.h (GOMP_VERSION_HSA): Remove.

gcc/ChangeLog:

2020-07-24  Martin Jambor  <mjambor@suse.cz>

	* hsa-brig-format.h: Moved to brig/brigfrontend.
	* hsa-brig.c: Removed.
	* hsa-builtins.def: Likewise.
	* hsa-common.c: Likewise.
	* hsa-common.h: Likewise.
	* hsa-dump.c: Likewise.
	* hsa-gen.c: Likewise.
	* hsa-regalloc.c: Likewise.
	* ipa-hsa.c: Likewise.
	* omp-grid.c: Likewise.
	* omp-grid.h: Likewise.
	* Makefile.in (BUILTINS_DEF): Remove hsa-builtins.def.
	(OBJS): Remove hsa-common.o, hsa-gen.o, hsa-regalloc.o, hsa-brig.o,
	hsa-dump.o, ipa-hsa.c and omp-grid.o.
	(GTFILES): Removed hsa-common.c and omp-expand.c.
	* builtins.def: Remove processing of hsa-builtins.def.
	(DEF_HSA_BUILTIN): Remove.
	* common.opt (flag_disable_hsa): Remove.
	(-Whsa): Ignore.
	* config.in (ENABLE_HSA): Removed.
	* configure.ac: Removed handling configuration for hsa offloading.
	(ENABLE_HSA): Removed.
	* configure: Regenerated.
	* doc/install.texi (--enable-offload-targets): Remove hsa from the
	example.
	(--with-hsa-runtime): Reword to reference any HSA run-time, not
	specifically HSA offloading.
	* doc/invoke.texi (Option Summary): Remove -Whsa.
	(Warning Options): Likewise.
	(Optimize Options): Remove hsa-gen-debug-stores.
	* doc/passes.texi (Regular IPA passes): Remove section on IPA HSA
	pass.
	* gimple-low.c (lower_stmt): Remove GIMPLE_OMP_GRID_BODY case.
	* gimple-pretty-print.c (dump_gimple_omp_for): Likewise.
	(dump_gimple_omp_block): Likewise.
	(pp_gimple_stmt_1): Likewise.
	* gimple-walk.c (walk_gimple_stmt): Likewise.
	* gimple.c (gimple_build_omp_grid_body): Removed function.
	(gimple_copy): Remove GIMPLE_OMP_GRID_BODY case.
	* gimple.def (GIMPLE_OMP_GRID_BODY): Removed.
	* gimple.h (gf_mask): Removed GF_OMP_PARALLEL_GRID_PHONY,
	OMP_FOR_KIND_GRID_LOOP, GF_OMP_FOR_GRID_PHONY,
	GF_OMP_FOR_GRID_INTRA_GROUP, GF_OMP_FOR_GRID_GROUP_ITER and
	GF_OMP_TEAMS_GRID_PHONY.  Renumbered GF_OMP_FOR_KIND_SIMD and
	GF_OMP_TEAMS_HOST.
	(gimple_build_omp_grid_body): Removed declaration.
	(gimple_has_substatements): Remove GIMPLE_OMP_GRID_BODY case.
	(gimple_omp_for_grid_phony): Removed.
	(gimple_omp_for_set_grid_phony): Likewise.
	(gimple_omp_for_grid_intra_group): Likewise.
	(gimple_omp_for_grid_intra_group): Likewise.
	(gimple_omp_for_grid_group_iter): Likewise.
	(gimple_omp_for_set_grid_group_iter): Likewise.
	(gimple_omp_parallel_grid_phony): Likewise.
	(gimple_omp_parallel_set_grid_phony): Likewise.
	(gimple_omp_teams_grid_phony): Likewise.
	(gimple_omp_teams_set_grid_phony): Likewise.
	(CASE_GIMPLE_OMP): Remove GIMPLE_OMP_GRID_BODY case.
	* lto-section-in.c (lto_section_name): Removed hsa.
	* lto-streamer.h (lto_section_type): Removed LTO_section_ipa_hsa.
	* lto-wrapper.c (compile_images_for_offload_targets): Remove special
	handling of hsa.
	* omp-expand.c: Do not include hsa-common.h and gt-omp-expand.h.
	(parallel_needs_hsa_kernel_p): Removed.
	(grid_launch_attributes_trees): Likewise.
	(grid_launch_attributes_trees): Likewise.
	(grid_create_kernel_launch_attr_types): Likewise.
	(grid_insert_store_range_dim): Likewise.
	(grid_get_kernel_launch_attributes): Likewise.
	(get_target_arguments): Remove code passing HSA grid sizes.
	(grid_expand_omp_for_loop): Remove.
	(grid_arg_decl_map): Likewise.
	(grid_remap_kernel_arg_accesses): Likewise.
	(grid_expand_target_grid_body): Likewise.
	(expand_omp): Remove call to grid_expand_target_grid_body.
	(omp_make_gimple_edges): Remove GIMPLE_OMP_GRID_BODY case.
	* omp-general.c: Do not include hsa-common.h.
	(omp_maybe_offloaded): Do not check for HSA offloading.
	(omp_context_selector_matches): Likewise.
	* omp-low.c: Do not include hsa-common.h and omp-grid.h.
	(build_outer_var_ref): Remove handling of GIMPLE_OMP_GRID_BODY.
	(scan_sharing_clauses): Remove handling of OMP_CLAUSE__GRIDDIM_.
	(scan_omp_parallel): Remove handling of the phoney variant.
	(check_omp_nesting_restrictions): Remove handling of
	GIMPLE_OMP_GRID_BODY and GF_OMP_FOR_KIND_GRID_LOOP.
	(scan_omp_1_stmt): Remove handling of GIMPLE_OMP_GRID_BODY.
	(lower_omp_for_lastprivate): Remove handling of gridified loops.
	(lower_omp_for): Remove phony loop handling.
	(lower_omp_taskreg): Remove phony construct handling.
	(lower_omp_teams): Likewise.
	(lower_omp_grid_body): Removed.
	(lower_omp_1): Remove GIMPLE_OMP_GRID_BODY case.
	(execute_lower_omp): Do not call omp_grid_gridify_all_targets.
	* opts.c (common_handle_option): Do not handle hsa when processing
	OPT_foffload_.
	* params.opt (hsa-gen-debug-stores): Remove.
	* passes.def: Remove pass_ipa_hsa and pass_gen_hsail.
	* timevar.def: Remove TV_IPA_HSA.
	* toplev.c: Do not include hsa-common.h.
	(compile_file): Do not call hsa_output_brig.
	* tree-core.h (enum omp_clause_code): Remove OMP_CLAUSE__GRIDDIM_.
	(tree_omp_clause): Remove union field dimension.
	* tree-nested.c (convert_nonlocal_omp_clauses): Remove the
	OMP_CLAUSE__GRIDDIM_ case.
	(convert_local_omp_clauses): Likewise.
	* tree-pass.h (make_pass_gen_hsail): Remove declaration.
	(make_pass_ipa_hsa): Likewise.
	* tree-pretty-print.c (dump_omp_clause): Remove GIMPLE_OMP_GRID_BODY
	case.
	* tree.c (omp_clause_num_ops): Remove the element corresponding to
	OMP_CLAUSE__GRIDDIM_.
	(omp_clause_code_name): Likewise.
	(walk_tree_1): Remove GIMPLE_OMP_GRID_BODY case.
	* tree.h (OMP_CLAUSE__GRIDDIM__DIMENSION): Remove.
	(OMP_CLAUSE__GRIDDIM__SIZE): Likewise.
	(OMP_CLAUSE__GRIDDIM__GROUP): Likewise.

gcc/fortran/ChangeLog:

2020-07-24  Martin Jambor  <mjambor@suse.cz>

	* f95-lang.c (gfc_init_builtin_functions): Remove processing of
	hsa-builtins.def.

gcc/brig/ChangeLog:

2020-07-24  Martin Jambor  <mjambor@suse.cz>

	* brigfrontend/brig-util.h (hsa_type_packed_p): Declared.
	* brigfrontend/brig-util.cc (hsa_type_packed_p): Moved here from
	removed gcc/hsa-common.c.

libgomp/ChangeLog:

2020-07-24  Martin Jambor  <mjambor@suse.cz>

	* plugin/Makefrag.am: Remove configuration of HSA plugin.
	* aclocal.m4: Regenerated.
	* Makefile.in: Regenerated.
	* config.h.in: Regenerated.
	* configure: Regenerated.
	* plugin/configfrag.ac: Likewise.
	* plugin/hsa_ext_finalize.h: Removed.
	* plugin/plugin-hsa.c: Likewise.
	* testsuite/Makefile.in: Regenerated.
	* testsuite/lib/libgomp.exp
	(offload_target_to_openacc_device_type): Remove hsa case.
	(check_effective_target_hsa_offloading_selected_nocache): Removed
	(check_effective_target_hsa_offloading_selected): Likewise.
	(libgomp_init): Do not add -Wno-hsa to additional_flags.
	* testsuite/libgomp.hsa.c/alloca-1.c: Removed test.
	* testsuite/libgomp.hsa.c/bitfield-1.c: Likewise.
	* testsuite/libgomp.hsa.c/bits-insns.c: Likewise.
	* testsuite/libgomp.hsa.c/builtins-1.c: Likewise.
	* testsuite/libgomp.hsa.c/c.exp: Likewise.
	* testsuite/libgomp.hsa.c/complex-1.c: Likewise.
	* testsuite/libgomp.hsa.c/complex-align-2.c: Likewise.
	* testsuite/libgomp.hsa.c/formal-actual-args-1.c: Likewise.
	* testsuite/libgomp.hsa.c/function-call-1.c: Likewise.
	* testsuite/libgomp.hsa.c/get-level-1.c: Likewise.
	* testsuite/libgomp.hsa.c/gridify-1.c: Likewise.
	* testsuite/libgomp.hsa.c/gridify-2.c: Likewise.
	* testsuite/libgomp.hsa.c/gridify-3.c: Likewise.
	* testsuite/libgomp.hsa.c/gridify-4.c: Likewise.
	* testsuite/libgomp.hsa.c/memory-operations-1.c: Likewise.
	* testsuite/libgomp.hsa.c/pr69568.c: Likewise.
	* testsuite/libgomp.hsa.c/pr82416.c: Likewise.
	* testsuite/libgomp.hsa.c/rotate-1.c: Likewise.
	* testsuite/libgomp.hsa.c/staticvar.c: Likewise.
	* testsuite/libgomp.hsa.c/switch-1.c: Likewise.
	* testsuite/libgomp.hsa.c/switch-branch-1.c: Likewise.
	* testsuite/libgomp.hsa.c/switch-sbr-2.c: Likewise.
	* testsuite/libgomp.hsa.c/tiling-1.c: Likewise.
	* testsuite/libgomp.hsa.c/tiling-2.c: Likewise.

gcc/testsuite/ChangeLog:

2020-07-24  Martin Jambor  <mjambor@suse.cz>

	* lib/target-supports.exp (check_effective_target_offload_hsa):
	Removed.
	* c-c++-common/gomp/gridify-1.c: Removed test.
	* c-c++-common/gomp/gridify-2.c: Likewise.
	* c-c++-common/gomp/gridify-3.c: Likewise.
	* c-c++-common/gomp/hsa-indirect-call-1.c: Likewise.
	* gfortran.dg/gomp/gridify-1.f90: Likewise.
	* gcc.dg/gomp/gomp.exp: Do not pass -Wno-hsa to tests.
	* g++.dg/gomp/gomp.exp: Likewise.
	* gfortran.dg/gomp/gomp.exp: Likewise.
2020-08-03 18:13:00 +02:00
Andrew Stubbs f062c3f115 amdgcn: Switch to HSACO v3 binary format
This upgrades the compiler to emit HSA Code Object v3 binaries.  This means
changing the assembler directives, and linker command line options.

The gcn-run and libgomp loaders need corresponding alterations.  The
relocations no longer need to be fixed up manually, and the kernel symbol
names have changed slightly.

This move makes the binaries compatible with the new rocgdb from ROCm 3.5.

2020-06-17  Andrew Stubbs  <ams@codesourcery.com>

	gcc/
	* config/gcn/gcn-hsa.h (TEXT_SECTION_ASM_OP): Use ".text".
	(BSS_SECTION_ASM_OP): Use ".bss".
	(ASM_SPEC): Remove "-mattr=-code-object-v3".
	(LINK_SPEC): Add "--export-dynamic".
	* config/gcn/gcn-opts.h (processor_type): Replace PROCESSOR_VEGA with
	PROCESSOR_VEGA10 and PROCESSOR_VEGA20.
	* config/gcn/gcn-run.c (HSA_RUNTIME_LIB): Use ".so.1" variant.
	(load_image): Remove obsolete relocation handling.
	Add ".kd" suffix to the symbol names.
	* config/gcn/gcn.c (MAX_NORMAL_SGPR_COUNT): Set to 62.
	(gcn_option_override): Update gcn_isa test.
	(gcn_kernel_arg_types): Update all the assembler directives.
	Remove the obsolete options.
	(gcn_conditional_register_usage): Update MAX_NORMAL_SGPR_COUNT usage.
	(gcn_omp_device_kind_arch_isa): Handle PROCESSOR_VEGA10 and
	PROCESSOR_VEGA20.
	(output_file_start): Rework assembler file header.
	(gcn_hsa_declare_function_name): Rework kernel metadata.
	* config/gcn/gcn.h (GCN_KERNEL_ARG_TYPES): Set to 16.
	* config/gcn/gcn.opt (PROCESSOR_VEGA): Remove enum.
	(PROCESSOR_VEGA10): New enum value.
	(PROCESSOR_VEGA20): New enum value.

	libgomp/
	* plugin/plugin-gcn.c (init_environment_variables): Use ".so.1"
	variant for HSA_RUNTIME_LIB name.
	(find_executable_symbol_1): Delete.
	(find_executable_symbol): Delete.
	(init_kernel_properties): Add ".kd" suffix to symbol names.
	(find_load_offset): Delete.
	(create_and_finalize_hsa_program): Remove relocation handling.
2020-06-17 10:06:21 +01:00
Andrew Stubbs 966de09be9 amdgcn: Check HSA return codes [PR94629]
Ensure that the returned status values are not ignored.  The old code was
not broken, but this is both safer and satisfies static analysis.

2020-04-23  Andrew Stubbs  <ams@codesourcery.com>

	PR other/94629

	libgomp/
	* plugin/plugin-gcn.c (init_hsa_context): Check return value from
	hsa_iterate_agents.
	(GOMP_OFFLOAD_init_device): Check return values from both calls to
	hsa_agent_iterate_regions.
2020-04-23 15:05:23 +01:00
Frederik Harwath 001ab12e62 openmp: ignore nowait if async execution is unsupported [PR93481]
An OpenMP "nowait" clause on a target construct currently leads to
a call to GOMP_OFFLOAD_async_run in the plugin that is used for
offloading at execution time. The nvptx plugin contains only a stub
of this function that always produces a fatal error if called.

This commit changes the "nowait" implementation to ignore the clause
if the executing device's plugin does not implement GOMP_OFFLOAD_async_run.
The stub in the nvptx plugin is removed which effectively means that
programs containing "nowait" can now be executed with nvptx offloading
as if the clause had not been used.
This behavior is consistent with the OpenMP specification which says that
"[...] execution of the target task *may* be deferred" (emphasis added),
cf. OpenMP 5.0, page 172.

libgomp/

	* plugin/plugin-nvptx.c: Remove GOMP_OFFLOAD_async_run stub.
	* target.c (gomp_load_plugin_for_device): Make "async_run" loading
	optional.
	(gomp_target_task_fn): Assert "devicep->async_run_func".
	(clear_unsupported_flags): New function to remove unsupported flags
	(right now only GOMP_TARGET_FLAG_NOWAIT) that can be be ignored.
	(GOMP_target_ext): Apply clear_unsupported_flags to flags.
	* testsuite/libgomp.c/target-33.c:
	Remove xfail for offload_target_nvptx.
	* testsuite/libgomp.c/target-34.c: Likewise.
2020-02-13 10:18:31 +01:00
Andrew Stubbs 591f869ad7 Remove gfx801 "carrizo" support
2020-02-03  Andrew Stubbs  <ams@codesourcery.com>

	gcc/
	* config.gcc: Remove "carrizo" support.
	* config/gcn/gcn-opts.h (processor_type): Likewise.
	* config/gcn/gcn.c (gcn_omp_device_kind_arch_isa): Likewise.
	* config/gcn/gcn.opt (gpu_type): Likewise.
	* config/gcn/t-omp-device: Likewise.

	libgomp/
	* plugin/plugin-gcn.c (EF_AMDGPU_MACH_AMDGCN_GFX801): Remove.
	(gcn_gfx801_s): Remove.
	(isa_hsa_name): Remove gfx801.
	(isa_gcc_name): Remove gfx801/carizzo.
	(isa_code): Remove gfx801.
2020-02-03 17:23:18 +00:00
Kwok Cheung Yeung 5a28e2727f [amdgcn] Scale number of threads/workers with VGPR usage
2020-01-31  Kwok Cheung Yeung  <kcy@codesourcery.com>

	gcc/
	* config/gcn/mkoffload.c (process_asm): Add sgpr_count and vgpr_count
	to definition of hsa_kernel_description.  Parse assembly to find SGPR
	and VGPR count of kernel and store in hsa_kernel_description.

	libgomp/
	* plugin/plugin-gcn.c (struct hsa_kernel_description): Add sgpr_count
	and vgpr_count fields.
	(struct kernel_info): Add a field for a hsa_kernel_description.
	(run_kernel): Reduce the number of threads/workers if the requested
	number would require too many VGPRs.
	(init_basic_kernel_info): Initialize description field with
	the hsa_kernel_description entry for the kernel.
2020-01-31 07:13:05 -08:00
Tobias Burnus 5ab5d81b36 Skip plugin-{gcn,hsa} for (-m)x32 (PR bootstrap/93409)
PR bootstrap/93409
        * plugin/configfrag.ac (enable_offload_targets): Skip
        HSA and GCN plugin besides -m32 also for -mx32.
        * configure: Regenerate.
2020-01-30 12:27:17 +01:00
Frederik Harwath 2e5ea57959 Add OpenACC acc_get_property support for AMD GCN
Add full support for the OpenACC 2.6 acc_get_property and
acc_get_property_string functions to the libgomp GCN plugin.

libgomp/
	* plugin-gcn.c (struct agent_info): Add fields "name" and
	"vendor_name" ...
	(GOMP_OFFLOAD_init_device): ... and init from here.
	(struct hsa_context_info): Add field "driver_version_s" ...
	(init_hsa_contest): ... and init from here.
	(GOMP_OFFLOAD_openacc_get_property): Replace stub with a proper
	implementation.
	* testsuite/libgomp.oacc-c-c++-common/acc_get_property.c:
	Enable test execution for amdgcn and host offloading targets.
	* testsuite/libgomp.oacc-fortran/acc_get_property.f90: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/acc_get_property-aux.c
	(expect_device_properties): Split function into ...
	(expect_device_string_properties): ... this new function ...
	(expect_device_memory): ... and this new function.
	* testsuite/libgomp.oacc-c-c++-common/acc_get_property-gcn.c:
	Add test.
2020-01-29 11:54:56 +01:00
Andrew Stubbs 14e5e74698 Fix libgomp plugin-gcn bug
2020-01-23  Andrew Stubbs  <ams@codesourcery.com>

	libgomp/
	* plugin/plugin-gcn.c (parse_target_attributes): Use correct mask for
	the device id.
2020-01-23 12:40:04 +00:00
Frederik Harwath 7d593fd672 Add runtime ISA check for amdgcn offloading
The HSA/ROCm runtime rejects binaries not built for the exact GPU device
present. So far, the libgomp amdgcn plugin does not verify that the GPU ISA
and the ISA specified at compile time match before handing over the binary to
the runtime.  In case of a mismatch, the user is confronted with an unhelpful
runtime error.

This commit implements a runtime ISA check. In case of an ISA mismatch, the
execution is aborted with a clear error message and a hint at the correct
compilation parameters for the GPU on which the execution has been attempted.

libgomp/
	* plugin/plugin-gcn.c (EF_AMDGPU_MACH): New enum.
	* (EF_AMDGPU_MACH_MASK): New constant.
	* (gcn_isa): New typedef.
	* (gcn_gfx801_s): New constant.
	* (gcn_gfx803_s): New constant.
	* (gcn_gfx900_s): New constant.
	* (gcn_gfx906_s): New constant.
	* (gcn_isa_name_len): New constant.
	* (elf_gcn_isa_field): New function.
	* (isa_hsa_name): New function.
	* (isa_gcc_name): New function.
	* (isa_code): New function.
	* (struct agent_info): Add field "device_isa" and remove field
	"gfx900_p".
	* (GOMP_OFFLOAD_init_device): Adapt agent init to "agent_info"
	field changes, fail if device has unknown ISA.
	* (parse_target_attributes): Replace "gfx900_p" by "device_isa".
	* (isa_matches_agent): New function ...
	* (create_and_finalize_hsa_program): ... used from here to check
	that the GPU ISA and the code-object ISA match.
2020-01-21 07:41:45 +01:00
Thomas Schwinge 6fc0385c0c OpenACC 'acc_get_property' cleanup
include/
	* gomp-constants.h (enum gomp_device_property): Remove.
	libgomp/
	* libgomp-plugin.h (enum goacc_property): New.  Adjust all users
	to use this instead of 'enum gomp_device_property'.
	(GOMP_OFFLOAD_get_property): Rename to...
	(GOMP_OFFLOAD_openacc_get_property): ... this.  Adjust all users.
	* libgomp.h (struct gomp_device_descr): Move
	'GOMP_OFFLOAD_openacc_get_property'...
	(struct acc_dispatch_t): ... here.  Adjust all users.
	* plugin/plugin-hsa.c (GOMP_OFFLOAD_get_property): Remove.
	liboffloadmic/
	* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_get_property):
	Remove.

From-SVN: r280150
2020-01-10 23:24:36 +01:00
Jakub Jelinek 8d9254fc8a Update copyright years.
From-SVN: r279813
2020-01-01 12:51:42 +01:00
Jakub Jelinek e9dcb75e40 re PR bootstrap/93074 (build FAIL with --enable-offload-targets=nvptx-none)
PR bootstrap/93074
	* plugin/cuda/cuda.h (cuDeviceGetName, cuDriverGetVersion): Declare.
	(cuDeviceTotalMem, cuMemGetInfo): Likewise.  Define to *_v2.

From-SVN: r279747
2019-12-28 10:26:03 +01:00
Maciej W. Rozycki 6c84c8bf9b Add OpenACC 2.6 `acc_get_property' support
Add generic support for the OpenACC 2.6 `acc_get_property' and
`acc_get_property_string' routines, as well as full handlers for the
host and the NVPTX offload targets and minimal handlers for the HSA,
Intel MIC, and AMD GCN offload targets.

Included are C/C++ and Fortran tests that, in particular, print
the property values for acc_property_vendor, acc_property_memory,
acc_property_free_memory, acc_property_name, and acc_property_driver.
The output looks as follows:

Vendor: GNU
Name: GOMP
Total memory: 0
Free memory: 0
Driver: 1.0

with the host driver (where the memory related properties are not
supported for the host device and yield 0, conforming to the standard)
and output like:

Vendor: Nvidia
Total memory: 12651462656
Free memory: 12202737664
Name: TITAN V
Driver: CUDA Driver 9.1

with the NVPTX driver.

2019-12-22  Maciej W. Rozycki  <macro@codesourcery.com>
	    Frederik Harwath  <frederik@codesourcery.com>
	    Thomas Schwinge  <tschwinge@codesourcery.com>

	include/
	* gomp-constants.h (gomp_device_property): New enum.

	libgomp/
	* libgomp.h (gomp_device_descr): Add `get_property_func' member.
	* libgomp-plugin.h (gomp_device_property_value): New union.
	(gomp_device_property_value): New prototype.
	* openacc.h (acc_device_t): Add `acc_device_current' enumeration
	constant.
	(acc_device_property_t): New enum.
	(acc_get_property, acc_get_property_string): New prototypes.
	* oacc-init.c (acc_get_device_type): Also assert that result
	is not `acc_device_current'.
	(get_property_any, acc_get_property, acc_get_property_string):
	New functions.
	* openacc.f90 (openacc_kinds): Add `acc_device_current' and
	`acc_property_memory', `acc_property_free_memory',
	`acc_property_name', `acc_property_vendor' and
	`acc_property_driver' constants.  Add `acc_device_property' data
	type.
	(openacc_internal): Add `acc_get_property' and
	`acc_get_property_string' interfaces.  Add `acc_get_property_h',
	`acc_get_property_string_h', `acc_get_property_l' and
	`acc_get_property_string_l'.
	* oacc-host.c (host_get_property): New function.
	(host_dispatch): Wire it.
	* target.c (gomp_load_plugin_for_device): Handle `get_property'.
	* libgomp.map (OACC_2.6): Add `acc_get_property', `acc_get_property_h_',
	`acc_get_property_string' and `acc_get_property_string_h_' symbols.
	* libgomp.texi (OpenACC Runtime Library Routines): Add
	`acc_get_property'.
	(acc_get_property): New node.
	* plugin/plugin-gcn.c (GOMP_OFFLOAD_get_property): New
	function (stub).
	* plugin/plugin-hsa.c (GOMP_OFFLOAD_get_property): New function.
	* plugin/plugin-nvptx.c (CUDA_CALLS): Add `cuDeviceGetName',
	`cuDeviceTotalMem', `cuDriverGetVersion' and `cuMemGetInfo'
	calls.
	(GOMP_OFFLOAD_get_property): New function.
	(struct ptx_device): Add new field "name".
	(cuda_driver_version_s): Add new static variable ...
	(nvptx_init): ... and init from here.

	* testsuite/libgomp.oacc-c-c++-common/acc_get_property.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/acc_get_property-2.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/acc_get_property-3.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/acc_get_property-aux.c: New file
	with test helper functions.

	* testsuite/libgomp.oacc-fortran/acc_get_property.f90: New test.

	liboffloadmic/
	* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_get_property):
	New function.

Reviewed-by: Thomas Schwinge <thomas@codesourcery.com>


Co-Authored-By: Frederik Harwath <frederik@codesourcery.com>
Co-Authored-By: Thomas Schwinge <tschwinge@codesourcery.com>

From-SVN: r279710
2019-12-22 19:54:09 +00:00
Tobias Burnus 93d9021987 libgomp – spelling fixes, incl. omp_lib.h.in
* omp_lib.h.in: Fix spelling of function declaration
        omp_get_cancell(l)ation.
        * libgomp.texi (acc_is_present, acc_async_test, acc_async_test_all):
        Fix typos.
        * env.c: Fix comment typos.
        * oacc-host.c: Likewise.
        * ordered.c: Likewise.
        * task.c: Likewise.
        * team.c: Likewise.
        * config/gcn/task.c: Likewise.
        * config/gcn/team.c: Likewise.
        * config/nvptx/task.c: Likewise.
        * config/nvptx/team.c: Likewise.
        * plugin/plugin-gcn.c: Likewise.
        * testsuite/libgomp.fortran/jacobi.f: Likewise.
        * testsuite/libgomp.hsa.c/tiling-2.c: Likewise.
        * testsuite/libgomp.oacc-c-c++-common/enter_exit-lib.c: Likewise.

From-SVN: r279218
2019-12-11 12:45:49 +01:00
Julian Brown d88b27daa1 AMD GCN libgomp plugin queue-full condition locking fix
libgomp/
	* plugin/plugin-gcn.c (wait_for_queue_nonfull): Don't lock/unlock
	aq->mutex here.
	(queue_push_launch): Lock aq->mutex before calling
	wait_for_queue_nonfull.
	(queue_push_callback): Likewise.
	(queue_push_asyncwait): Likewise.
	(queue_push_placeholder): Likewise.

Reviewed-by: Andrew Stubbs <ams@codesourcery.com>

From-SVN: r278517
2019-11-20 17:56:30 +00:00
Julian Brown 8d2f4ddfd7 Fix host-to-device copies from rodata for AMD GCN
libgomp/
	* plugin/plugin-gcn.c (hsa_memory_copy_wrapper): New.
	(copy_data, GOMP_OFFLOAD_host2dev): Use above function.
	(GOMP_OFFLOAD_dev2host, GOMP_OFFLOAD_dev2dev): Check hsa_memory_copy
	return code.

Reviewed-by: Andrew Stubbs <ams@codesourcery.com>

From-SVN: r278516
2019-11-20 17:53:31 +00:00
Andrew Stubbs 237957cc2c GCN Libgomp Plugin
2019-11-13  Andrew Stubbs  <ams@codesourcery.com>
	    Kwok Cheung Yeung  <kcy@codesourcery.com>
	    Julian Brown  <julian@codesourcery.com>
	    Tom de Vries  <tom@codesourcery.com>

	libgomp/
	* plugin/Makefrag.am: Add amdgcn plugin support.
	* plugin/configfrag.ac: Likewise.
	* plugin/plugin-gcn.c: New file.
	* configure: Regenerate.
	* Makefile.in: Regenerate.
	* testsuite/Makefile.in: Regenerate.

Co-Authored-By: Julian Brown <julian@codesourcery.com>
Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>
Co-Authored-By: Tom de Vries <tom@codesourcery.com>

From-SVN: r278138
2019-11-13 12:38:18 +00:00
Andrew Stubbs d2903ce05b Add device number to GOMP_OFFLOAD_openacc_async_construct
2019-11-13  Andrew Stubbs  <ams@codesourcery.com>
	    Julian Brown  <julian@codesourcery.com>

	libgomp/
	* libgomp-plugin.h (GOMP_OFFLOAD_openacc_async_construct): Add int
	parameter.
	* oacc-async.c (lookup_goacc_asyncqueue): Pass device number to the
	queue constructor.
	* oacc-host.c (host_openacc_async_construct): Add device parameter.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_openacc_async_construct): Add
	device parameter.

Co-Authored-By: Julian Brown <julian@codesourcery.com>

From-SVN: r278134
2019-11-13 12:37:59 +00:00
Jakub Jelinek 810f316dd6 configure.ac: Remove GCC_HEADER_STDINT(gstdint.h).
* configure.ac: Remove GCC_HEADER_STDINT(gstdint.h).
	* libgomp.h: Include <stdint.h> instead of "gstdint.h".
	* oacc-parallel.c: Don't include "libgomp_g.h".
	* plugin/plugin-hsa.c: Include <stdint.h> instead of "gstdint.h".
	* plugin/plugin-nvptx.c: Don't include "gstdint.h".
	* aclocal.m4: Regenerated.
	* config.h.in: Regenerated.
	* configure: Regenerated.
	* Makefile.in: Regenerated.

From-SVN: r276389
2019-10-01 09:51:46 +02:00
Tobias Burnus c28712beb4 libgomp plugin - init string
libgomp/
2019-09-13  Tobias Burnus  <tobias@codesourcery.com>

        * plugin/plugin-hsa.c (hsa_warn, hsa_fatal, hsa_error): Ensure
        string is initialized.

From-SVN: r275703
2019-09-13 20:14:02 +02:00
Jakub Jelinek b5c26449f3 re PR libgomp/90585 (libgomp hsa plugin ftbfs in the x32 multilib variant)
PR libgomp/90585
	* plugin/plugin-hsa.c: Include gstdint.h.  Include inttypes.h only if
	HAVE_INTTYPES_H is defined.
	(print_uint64_t): New typedef.
	(PRIu64): Define if HAVE_INTTYPES_H is not defined.
	(print_kernel_dispatch, run_kernel): Use PRIu64 macro instead of
	"lu", cast uint64_t HSA_DEBUG and fprintf arguments to print_uint64_t.
	(release_kernel_dispatch): Likewise.  Cast shadow->debug to uintptr_t
	before casting to void *.
	* plugin/plugin-nvptx.c: Include gstdint.h instead of stdint.h.
	* oacc-mem.c: Don't include config.h nor stdint.h.
	* target.c: Don't include config.h.
	* oacc-cuda.c: Likewise.
	* oacc-host.c: Don't include stdint.h.

From-SVN: r271597
2019-05-24 10:59:37 +02:00
Thomas Schwinge 5fae049dc2 OpenACC Profiling Interface (incomplete)
libgomp/
	* acc_prof.h: New file.
	* oacc-profiling.c: Likewise.
	* Makefile.am (nodist_libsubinclude_HEADERS, libgomp_la_SOURCES):
	Add these, respectively.
	* Makefile.in: Regenerate.
	* env.c (initialize_env): Call goacc_profiling_initialize.
	* oacc-plugin.c (GOMP_PLUGIN_goacc_thread)
	(GOMP_PLUGIN_goacc_profiling_dispatch): New functions.
	* oacc-plugin.h (GOMP_PLUGIN_goacc_thread)
	(GOMP_PLUGIN_goacc_profiling_dispatch): Declare.
	* libgomp.map (OACC_2.5.1): Add acc_prof_lookup,
	acc_prof_register, acc_prof_unregister, and acc_register_library.
	(GOMP_PLUGIN_1.3): Add GOMP_PLUGIN_goacc_profiling_dispatch, and
	GOMP_PLUGIN_goacc_thread.
	* oacc-int.h (struct goacc_thread): Add prof_info, api_info,
	prof_callbacks_enabled members.
	(goacc_prof_enabled, goacc_profiling_initialize)
	(_goacc_profiling_dispatch_p, _goacc_profiling_setup_p)
	(goacc_profiling_dispatch): Declare.
	(GOACC_PROF_ENABLED, GOACC_PROFILING_DISPATCH_P)
	(GOACC_PROFILING_SETUP_P): Define.
	* oacc-async.c (acc_async_test, acc_async_test_all, acc_wait)
	(acc_wait_async, acc_wait_all, acc_wait_all_async): Update for
	OpenACC Profiling Interface.
	* oacc-cuda.c (acc_get_current_cuda_device)
	(acc_get_current_cuda_context, acc_get_cuda_stream)
	(acc_set_cuda_stream): Likewise.
	* oacc-init.c (acc_init_1, goacc_attach_host_thread_to_device)
	(acc_init, acc_set_device_type, acc_get_device_type)
	(acc_get_device_num, goacc_lazy_initialize): Likewise.
	* oacc-mem.c (acc_malloc, acc_free, memcpy_tofrom_device)
	(acc_deviceptr, acc_hostptr, acc_is_present, acc_map_data)
	(acc_unmap_data, present_create_copy, delete_copyout)
	(update_dev_host): Likewise.
	* oacc-parallel.c (GOACC_parallel_keyed, GOACC_data_start)
	(GOACC_data_end, GOACC_enter_exit_data, GOACC_update, GOACC_wait):
	Likewise.
	* plugin/plugin-nvptx.c (nvptx_exec, nvptx_alloc, nvptx_free)
	(GOMP_OFFLOAD_openacc_exec, GOMP_OFFLOAD_openacc_async_exec):
	Likewise.
	* libgomp.texi: Update.
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-dispatch-1.c: New
	file.
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-init-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-kernels-1.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-parallel-1.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-valid_bytes-1.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/acc_prof-version-1.c:
	Likewise.

From-SVN: r271346
2019-05-17 21:13:36 +02:00
Chung-Lin Tang 1f4c5b9bb2 2019-05-13 Chung-Lin Tang <cltang@codesourcery.com>
Reviewed-by: Thomas Schwinge <thomas@codesourcery.com>

	libgomp/
	* libgomp-plugin.h (struct goacc_asyncqueue): Declare.
	(struct goacc_asyncqueue_list): Likewise.
	(goacc_aq): Likewise.
	(goacc_aq_list): Likewise.
	(GOMP_OFFLOAD_openacc_register_async_cleanup): Remove.
	(GOMP_OFFLOAD_openacc_async_test): Remove.
	(GOMP_OFFLOAD_openacc_async_test_all): Remove.
	(GOMP_OFFLOAD_openacc_async_wait): Remove.
	(GOMP_OFFLOAD_openacc_async_wait_async): Remove.
	(GOMP_OFFLOAD_openacc_async_wait_all): Remove.
	(GOMP_OFFLOAD_openacc_async_wait_all_async): Remove.
	(GOMP_OFFLOAD_openacc_async_set_async): Remove.
	(GOMP_OFFLOAD_openacc_exec): Adjust declaration.
	(GOMP_OFFLOAD_openacc_cuda_get_stream): Likewise.
	(GOMP_OFFLOAD_openacc_cuda_set_stream): Likewise.
	(GOMP_OFFLOAD_openacc_async_exec): Declare.
	(GOMP_OFFLOAD_openacc_async_construct): Declare.
	(GOMP_OFFLOAD_openacc_async_destruct): Declare.
	(GOMP_OFFLOAD_openacc_async_test): Declare.
	(GOMP_OFFLOAD_openacc_async_synchronize): Declare.
	(GOMP_OFFLOAD_openacc_async_serialize): Declare.
	(GOMP_OFFLOAD_openacc_async_queue_callback): Declare.
	(GOMP_OFFLOAD_openacc_async_host2dev): Declare.
	(GOMP_OFFLOAD_openacc_async_dev2host): Declare.

	* libgomp.h (struct acc_dispatch_t): Define 'async' sub-struct.
	(gomp_acc_insert_pointer): Adjust declaration.
	(gomp_copy_host2dev): New declaration.
	(gomp_copy_dev2host): Likewise.
	(gomp_map_vars_async): Likewise.
	(gomp_unmap_tgt): Likewise.
	(gomp_unmap_vars_async): Likewise.
	(gomp_fini_device): Likewise.

	* oacc-async.c (get_goacc_thread): New function.
	(get_goacc_thread_device): New function.
	(lookup_goacc_asyncqueue): New function.
	(get_goacc_asyncqueue): New function.
	(acc_async_test): Adjust code to use new async design.
	(acc_async_test_all): Likewise.
	(acc_wait): Likewise.
	(acc_wait_async): Likewise.
	(acc_wait_all): Likewise.
	(acc_wait_all_async): Likewise.
	(goacc_async_free): New function.
	(goacc_init_asyncqueues): Likewise.
	(goacc_fini_asyncqueues): Likewise.
	* oacc-cuda.c (acc_get_cuda_stream): Adjust code to use new async
	design.
	(acc_set_cuda_stream): Likewise.
	* oacc-host.c (host_openacc_exec): Adjust parameters, remove 'async'.
	(host_openacc_register_async_cleanup): Remove.
	(host_openacc_async_exec): New function.
	(host_openacc_async_test): Adjust parameters.
	(host_openacc_async_test_all): Remove.
	(host_openacc_async_wait): Remove.
	(host_openacc_async_wait_async): Remove.
	(host_openacc_async_wait_all): Remove.
	(host_openacc_async_wait_all_async): Remove.
	(host_openacc_async_set_async): Remove.
	(host_openacc_async_synchronize): New function.
	(host_openacc_async_serialize): New function.
	(host_openacc_async_host2dev): New function.
	(host_openacc_async_dev2host): New function.
	(host_openacc_async_queue_callback): New function.
	(host_openacc_async_construct): New function.
	(host_openacc_async_destruct): New function.
	(struct gomp_device_descr host_dispatch): Remove initialization of old
	interface, add intialization of new async sub-struct.
	* oacc-init.c (acc_shutdown_1): Adjust to use gomp_fini_device.
	(goacc_attach_host_thread_to_device): Remove old async code usage.
	* oacc-int.h (goacc_init_asyncqueues): New declaration.
	(goacc_fini_asyncqueues): Likewise.
	(goacc_async_copyout_unmap_vars): Likewise.
	(goacc_async_free): Likewise.
	(get_goacc_asyncqueue): Likewise.
	(lookup_goacc_asyncqueue): Likewise.

	* oacc-mem.c (memcpy_tofrom_device): Adjust code to use new async
	design.
	(present_create_copy): Adjust code to use new async design.
	(delete_copyout): Likewise.
	(update_dev_host): Likewise.
	(gomp_acc_insert_pointer): Add async parameter, adjust code to use new
	async design.
	(gomp_acc_remove_pointer): Adjust code to use new async design.
	* oacc-parallel.c (GOACC_parallel_keyed): Adjust code to use new async
	design.
	(GOACC_enter_exit_data): Likewise.
	(goacc_wait): Likewise.
	(GOACC_update): Likewise.
	* oacc-plugin.c (GOMP_PLUGIN_async_unmap_vars): Change to assert fail
	when called, warn as obsolete in comment.

	* target.c (goacc_device_copy_async): New function.
	(gomp_copy_host2dev): Remove 'static', add goacc_asyncqueue parameter,
	add goacc_device_copy_async case.
	(gomp_copy_dev2host): Likewise.
	(gomp_map_vars_existing): Add goacc_asyncqueue parameter, adjust code.
	(gomp_map_pointer): Likewise.
	(gomp_map_fields_existing): Likewise.
	(gomp_map_vars_internal): New always_inline function, renamed from
	gomp_map_vars.
	(gomp_map_vars): Implement by calling gomp_map_vars_internal.
	(gomp_map_vars_async): Implement by calling gomp_map_vars_internal,
	passing goacc_asyncqueue argument.
	(gomp_unmap_tgt): Remove static, add attribute_hidden.
	(gomp_unref_tgt): New function.
	(gomp_unmap_vars_internal): New always_inline function, renamed from
	gomp_unmap_vars.
	(gomp_unmap_vars): Implement by calling gomp_unmap_vars_internal.
	(gomp_unmap_vars_async): Implement by calling
	gomp_unmap_vars_internal, passing goacc_asyncqueue argument.
	(gomp_fini_device): New function.
	(gomp_exit_data): Adjust gomp_copy_dev2host call.
	(gomp_load_plugin_for_device): Remove old interface, adjust to load
	new async interface.
	(gomp_target_fini): Adjust code to call gomp_fini_device.

	* plugin/plugin-nvptx.c (struct cuda_map): Remove.
	(struct ptx_stream): Remove.
	(struct nvptx_thread): Remove current_stream field.
	(cuda_map_create): Remove.
	(cuda_map_destroy): Remove.
	(map_init): Remove.
	(map_fini): Remove.
	(map_pop): Remove.
	(map_push): Remove.
	(struct goacc_asyncqueue): Define.
	(struct nvptx_callback): Define.
	(struct ptx_free_block): Define.
	(struct ptx_device): Remove null_stream, active_streams, async_streams,
	stream_lock, and next fields.
	(enum ptx_event_type): Remove.
	(struct ptx_event): Remove.
	(ptx_event_lock): Remove.
	(ptx_events): Remove.
	(init_streams_for_device): Remove.
	(fini_streams_for_device): Remove.
	(select_stream_for_async): Remove.
	(nvptx_init): Remove ptx_events and ptx_event_lock references.
	(nvptx_attach_host_thread_to_device): Remove CUDA_ERROR_NOT_PERMITTED
	case.
	(nvptx_open_device): Add free_blocks initialization, remove
	init_streams_for_device call.
	(nvptx_close_device): Remove fini_streams_for_device call, add
	free_blocks destruct code.
	(event_gc): Remove.
	(event_add): Remove.
	(nvptx_exec): Adjust parameters and code.
	(nvptx_free): Likewise.
	(nvptx_host2dev): Remove.
	(nvptx_dev2host): Remove.
	(nvptx_set_async): Remove.
	(nvptx_async_test): Remove.
	(nvptx_async_test_all): Remove.
	(nvptx_wait): Remove.
	(nvptx_wait_async): Remove.
	(nvptx_wait_all): Remove.
	(nvptx_wait_all_async): Remove.
	(nvptx_get_cuda_stream): Remove.
	(nvptx_set_cuda_stream): Remove.
	(GOMP_OFFLOAD_alloc): Adjust code.
	(GOMP_OFFLOAD_free): Likewise.
	(GOMP_OFFLOAD_openacc_register_async_cleanup): Remove.
	(GOMP_OFFLOAD_openacc_exec): Adjust parameters and code.
	(GOMP_OFFLOAD_openacc_async_test_all): Remove.
	(GOMP_OFFLOAD_openacc_async_wait): Remove.
	(GOMP_OFFLOAD_openacc_async_wait_async): Remove.
	(GOMP_OFFLOAD_openacc_async_wait_all): Remove.
	(GOMP_OFFLOAD_openacc_async_wait_all_async): Remove.
	(GOMP_OFFLOAD_openacc_async_set_async): Remove.
	(cuda_free_argmem): New function.
	(GOMP_OFFLOAD_openacc_async_exec): New plugin hook function.
	(GOMP_OFFLOAD_openacc_create_thread_data): Adjust code.
	(GOMP_OFFLOAD_openacc_cuda_get_stream): Adjust code.
	(GOMP_OFFLOAD_openacc_cuda_set_stream): Adjust code.
	(GOMP_OFFLOAD_openacc_async_construct): New plugin hook function.
	(GOMP_OFFLOAD_openacc_async_destruct): New plugin hook function.
	(GOMP_OFFLOAD_openacc_async_test): Remove and re-implement.
	(GOMP_OFFLOAD_openacc_async_synchronize): New plugin hook function.
	(GOMP_OFFLOAD_openacc_async_serialize): New plugin hook function.
	(GOMP_OFFLOAD_openacc_async_queue_callback): New plugin hook function.
	(cuda_callback_wrapper): New function.
	(cuda_memcpy_sanity_check): New function.
	(GOMP_OFFLOAD_host2dev): Remove and re-implement.
	(GOMP_OFFLOAD_dev2host): Remove and re-implement.
	(GOMP_OFFLOAD_openacc_async_host2dev): New plugin hook function.
	(GOMP_OFFLOAD_openacc_async_dev2host): New plugin hook function.

From-SVN: r271128
2019-05-13 13:32:00 +00:00
Thomas Schwinge 0a0384b43a [libgomp] In OpenACC testing, cycle though all offload targets
... instead of through offload plugins.

	libgomp/
	* plugin/configfrag.ac: Populate and AC_SUBST offload_targets.
	* testsuite/libgomp-test-support.exp.in: Adjust.
	* testsuite/lib/libgomp.exp: Likewise.  Don't populate
	openacc_device_types_s.
	(offload_target_to_openacc_device_type): New proc.
	* testsuite/libgomp.oacc-c++/c++.exp: Adjust.
	* testsuite/libgomp.oacc-c/c.exp: Likewise.
	* testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.
	* Makefile.in: Regenerate.
	* configure: Likewise.
	* testsuite/Makefile.in: Likewise.

From-SVN: r269108
2019-02-22 11:51:20 +01:00
Thomas Schwinge ee332b4a9a [libgomp] Clarify difference between offload target, offload plugin, and OpenACC device type
libgomp/
	* plugin/configfrag.ac: Populate and AC_SUBST offload_plugins
	instead of offload_targets, and AC_DEFINE_UNQUOTED OFFLOAD_PLUGINS
	instead of OFFLOAD_TARGETS.
	* target.c (gomp_target_init): Adjust.
	* testsuite/libgomp-test-support.exp.in: Likewise.
	* testsuite/lib/libgomp.exp: Likewise.  Populate
	openacc_device_types_s instead of offload_targets_s_openacc.
	(check_effective_target_openacc_nvidia_accel_selected)
	(check_effective_target_openacc_host_selected): Adjust.
	* testsuite/libgomp.oacc-c++/c++.exp: Likewise.
	* testsuite/libgomp.oacc-c/c.exp: Likewise.
	* testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.
	* Makefile.in: Regenerate.
	* config.h.in: Likewise.
	* configure: Likewise.
	* testsuite/Makefile.in: Likewise.

From-SVN: r269107
2019-02-22 11:51:05 +01:00
Tom de Vries 738c56d410 [nvptx, libgomp] Fix memleak in GOMP_OFFLOAD_fini_device
I wrote a test-case:
...
int
main (void)
{
  for (unsigned i = 0; i < 128; ++i)
    {
      acc_init (acc_device_nvidia);
      acc_shutdown (acc_device_nvidia);
    }

  return 0;
}
...
and ran it under valgrind.  The only leak location reported with a frequency
of 128, was the allocation of ptx_devices in nvptx_init.

Fix this by freeing ptx_devices in GOMP_OFFLOAD_fini_device, once
instantiated_devices drops to 0.

2019-01-24  Tom de Vries  <tdevries@suse.de>

	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_fini_device): Free ptx_devices
	once instantiated_devices drops to 0.

From-SVN: r268237
2019-01-24 14:12:19 +00:00
Tom de Vries 4a75460b00 [nvptx, libgomp] Fix cuMemAlloc with size zero
Consider test-case:
...
int
main (void)
{
  #pragma acc parallel async
  ;
  #pragma acc parallel async
  ;
  #pragma acc wait

  return 0;
}
...

This fails with:
...
libgomp: cuMemAlloc error: invalid argument
Segmentation fault (core dumped)
...
The cuMemAlloc error is due to the fact that we're try to allocate 0 bytes.

Fix this by preventing calling map_push with size zero argument in nvptx_exec.

This also has the consequence that for the abort-1.c test-case, we end up
calling cuMemFree during map_fini for the struct cuda_map allocated in
map_init, which fails because an abort happened.  Fix this by calling
cuMemFree with CUDA_CALL_NOCHECK in cuda_map_destroy.

2019-01-23  Tom de Vries  <tdevries@suse.de>

	PR target/PR88946
	* plugin/plugin-nvptx.c (cuda_map_destroy): Use CUDA_CALL_NOCHECK for
	cuMemFree.
	(nvptx_exec): Don't call map_push if mapnum == 0.
	* testsuite/libgomp.oacc-c-c++-common/pr88946.c: New test.

From-SVN: r268178
2019-01-23 08:16:56 +00:00
Tom de Vries 4fef8e4d8c [nvptx, libgomp] Fix assert (!s->map->active) in map_fini
There are currently two situations where this assert triggers:
...
libgomp/plugin/plugin-nvptx.c: map_fini: Assertion `!s->map->active' failed.
...

First, in abort-1.c, a parallel region triggering an abort:
...
int
main (void)
{
  #pragma acc parallel
  abort ();

  return 0;
}
...

The abort is detected in nvptx_exec as the CUDA_ERROR_ILLEGAL_INSTRUCTION
return status of the cuStreamSynchronize call after kernel launch, which is
then handled by calling non-returning function GOMP_PLUGIN_fatal.
Consequently, the map_pop in nvptx_exec that in case of cuStreamSynchronize
success would remove or inactive the element added by the map_push earlier in
nvptx_exec, does not trigger.  With the element no longer active, but still
marked active and a member of s->map,  we run into the assert during
GOMP_OFFLOAD_fini_device, which is triggered from atexit handler
gomp_target_fini (which is triggered by the GOMP_PLUGIN_fatal mentioned above
calling exit).

Second, in pr88941.c, an async parallel region without wait:
...
int
main (void)
{
  #pragma acc parallel async
  ;

  /* no #pragma acc wait */
  return 0;
}
...

Because nvptx_exec is handling an async region, it does not call map_pop for
the element added by map_push, but schedules an kernel execution completion
event to call map_pop.  Again, we run into the assert during
GOMP_OFFLOAD_fini_device, which is triggered from atexit handler
gomp_target_fini, but the exit in this case is triggered by returning from main.
So either the kernel is still running, or the kernel has completed but the
corresponding event that is supposed to call map_pop is stuck in the event
queue, waiting for an event_gc.

Fix this by removing the assert, and skipping the freeing of device memory if
the map is still marked active (though in the async case, this is more a
workaround than an fix).

2019-01-23  Tom de Vries  <tdevries@suse.de>

	PR target/88941
	PR target/88939
	* plugin/plugin-nvptx.c (cuda_map_destroy): Handle map->active case.
	(map_fini): Remove "assert (!s->map->active)".
	* testsuite/libgomp.oacc-c-c++-common/pr88941.c: New test.

From-SVN: r268177
2019-01-23 08:16:42 +00:00
Tom de Vries 2ee6cb22c1 [nvptx, libgomp] Fix map_push
The map field of a struct ptx_stream is a FIFO.  The FIFO is implemented as a
single linked list, with pop-from-the-front semantics.

The function map_pop pops an element, either by:
- deallocating the element, if there is more than one element
- or marking the element inactive, if there's only one element

The responsibility of map_push is to push an element to the back, as well as
selecting the element to push, by:
- allocating an element, or
- reusing the element at the front if inactive and big enough, or
- dropping the element at the front if inactive and not big enough, and
  allocating one that's big enough

The current implemention gets at least the first and most basic scenario wrong:

> map = cuda_map_create (size);

We create an element, and assign it to map.

> for (t = s->map; t->next != NULL; t = t->next)
>   ;

We determine the last element in the fifo.

> t->next = map;

We append the new element.

> s->map = map;

But here, we throw away the rest of the FIFO, and declare the FIFO to be just
the new element.

This problem causes the test-case asyncwait-1.c to fail intermittently on some
systems.  The pr87835.c test-case added here is a a minimized and modified
version of asyncwait-1.c (avoiding the kernel construct) that is more likely to
fail.

Fix this by rewriting map_pop more robustly, by:
- seperating the function in two phases: select element, push element
- when reusing or dropping an element, making sure that the element is cleanly
  popped from the queue
- rewriting the push element part in such a way that it can handle all cases
  without needing if statements, such that each line is exercised for each of
  the three cases.

2019-01-23  Tom de Vries  <tdevries@suse.de>

	PR target/87835
	* plugin/plugin-nvptx.c (map_push): Fix adding of allocated element.
	* testsuite/libgomp.oacc-c-c++-common/pr87835.c: New test.

From-SVN: r268176
2019-01-23 08:16:11 +00:00
Tom de Vries 2c2ff1684d [nvptx] Enable setting vector length using -fopenacc-dim
Enable setting vector length using -fopenacc-dim, f.i. -fopenacc-dim=::128.

2019-01-12  Tom de Vries  <tdevries@suse.de>

	* config/nvptx/nvptx.c (nvptx_goacc_validate_dims_1): Alow setting
	vector length using -fopenacc-dim.

	* plugin/plugin-nvptx.c (nvptx_exec): Update error message.

From-SVN: r267896
2019-01-12 22:19:15 +00:00
Tom de Vries 52d22ece49 [nvptx] Update insufficient launch message for variable vector_length
Update message in nvptx libgomp plugin about insufficient resources to launch
kernel, to accommodate for the fact the vector_length can now be variable.

2019-01-12  Tom de Vries  <tdevries@suse.de>

	* plugin/plugin-nvptx.c (nvptx_exec): Update insufficient hardware
	resources diagnostic.

From-SVN: r267890
2019-01-12 22:18:00 +00:00
Tom de Vries 052aaaceed [nvptx] Don't allow vector_length 64 with num_workers 16
When using a compiler build with:
...
+#define PTX_DEFAULT_VECTOR_LENGTH PTX_CTA_SIZE
...

consider a test-case:
...
int
main (void)
{
  #pragma acc parallel vector_length (64)
    #pragma acc loop worker
    for (unsigned int i = 0; i < 32; i++)
      #pragma acc loop vector
      for (unsigned int j = 0; j < 64; j++)
        ;

  return 0;
}
...

If num_workers is 16, either because:
- we add a "num_workers (16)" clause on the parallel directive, or
- we set "GOMP_OPENACC_DIM=:16:", or
- the libgomp plugin chooses 16 num_workers
we run into an illegal instruction at runtime, because a bar.sync instruction
tries to use a barrier 16.  The instruction is illegal, because ptx supports
only 16 barriers per CTA, and the valid range is 0..15.

The problem is that with a warp-multiple vector length, we use a code generation
scheme with a per-worker barrier.  And because barrier zero is reserved for
per-cta barrier, only the remaining 15 barriers can be used as per-worker
barrier, and consequently we can't use num_workers larger than 15.

This problem occurs only for vector_length 64.  For vector_length 32, we use a
different code generation scheme, and for vector_length >= 96, the maximum
num_workers is not big enough not to trigger this problem.

Also, this problem only occurs for num_workers 16.  As explained above,
num_workers 15 is safe to use, and 16 is already the maximum num_workers for
vector_length 64.

This patch fixes the problem in both the compiler (handling "num_workers (16)")
and in the libgomp nvptx plugin (with and without "GOMP_OPENACC_DIM=:16:").

2019-01-11  Tom de Vries  <tdevries@suse.de>

	* config/nvptx/nvptx.c (PTX_CTA_NUM_BARRIERS, PTX_PER_CTA_BARRIER)
	(PTX_NUM_PER_CTA_BARRIER, PTX_FIRST_PER_WORKER_BARRIER)
	(PTX_NUM_PER_WORKER_BARRIERS): Define.
	(nvptx_apply_dim_limits): Prevent vector_length 64 and
	num_workers 16.

	* plugin/plugin-nvptx.c (nvptx_exec): Prevent vector_length 64 and
	num_workers 16.

From-SVN: r267838
2019-01-11 11:46:43 +00:00
Tom de Vries 2c372e81a9 [nvptx, libgomp] Don't launch with num_workers == 0
When using a compiler build with:
...
+#define PTX_DEFAULT_VECTOR_LENGTH PTX_CTA_SIZE
+#define PTX_MAX_VECTOR_LENGTH PTX_CTA_SIZE
...
and running the libgomp testsuite, we run into an execution failure in
parallel-loop-1.c, due to a cuda launch failure:
...
  nvptx_exec: kernel f6_none_none$_omp_fn$0: launch gangs=480, workers=0, \
    vectors=1024

libgomp: cuLaunchKernel error: invalid argument
...
because workers == 0.

The workers variable is set to 0 here in nvptx_exec:
...
                workers = blocks / actual_vectors;
...
because actual_vectors is 1024, and blocks is 768:
...
cuOccupancyMaxPotentialBlockSize: grid = 10, block = 768
...

Fix this by ensuring that workers is at least one.

2019-01-09  Tom de Vries  <tdevries@suse.de>

	* plugin/plugin-nvptx.c (nvptx_exec): Make sure to launch with at least
	one worker.

From-SVN: r267746
2019-01-09 00:07:45 +00:00
Jakub Jelinek a554497024 Update copyright years.
From-SVN: r267494
2019-01-01 13:31:55 +01:00
Thomas Schwinge f847198ec3 [PR88495] An OpenACC async queue is always synchronized with itself
An OpenACC async queue is always synchronized with itself, so invocations like
"#pragma acc wait(0) async(0)", or "acc_wait_async (0, 0)" don't make a lot of
sense, but are still valid.

	libgomp/
	PR libgomp/88495
	* plugin/plugin-nvptx.c (nvptx_wait_async): Don't refuse
	"identical parameters".
	* testsuite/libgomp.oacc-c-c++-common/asyncwait-nop-1.c: Update.
	* testsuite/libgomp.oacc-c-c++-common/lib-80.c: Remove.

From-SVN: r267152
2018-12-14 21:43:02 +01:00
Thomas Schwinge 1404af62dc [PR88407] [OpenACC] Correctly handle unseen async-arguments
... which turn the operation into a no-op.

	libgomp/
	PR libgomp/88407
	* plugin/plugin-nvptx.c (nvptx_async_test, nvptx_wait)
	(nvptx_wait_async): Unseen async-argument is a no-op.
	* testsuite/libgomp.oacc-c-c++-common/async_queue-1.c: Update.
	* testsuite/libgomp.oacc-c-c++-common/data-2-lib.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/data-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-79.c: Likewise.
	* testsuite/libgomp.oacc-fortran/lib-12.f90: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-71.c: Merge into...
	* testsuite/libgomp.oacc-c-c++-common/lib-69.c: ... this.  Update.
	* testsuite/libgomp.oacc-c-c++-common/lib-77.c: Merge into...
	* testsuite/libgomp.oacc-c-c++-common/lib-74.c: ... this.  Update

From-SVN: r267150
2018-12-14 21:42:40 +01:00
Thomas Schwinge 18c247cc0b [PR88370] acc_get_cuda_stream/acc_set_cuda_stream: acc_async_sync, acc_async_noval
Per my reading of the OpenACC specification (and as supported by secondary
documentation, such as code examples, or presentations), it's valid to call
"acc_get_cuda_stream"/"acc_set_cuda_stream" also with "acc_async_sync",
"acc_async_noval" arguments, not just with the nonnegative values as currently
implemented.

	libgomp/
	PR libgomp/88370
	* libgomp.texi (acc_get_current_cuda_context, acc_get_cuda_stream)
	(acc_set_cuda_stream): Clarify.
	* oacc-cuda.c (acc_get_cuda_stream, acc_set_cuda_stream): Use
	"async_valid_p".
	* plugin/plugin-nvptx.c (nvptx_set_cuda_stream): Refuse "async ==
	acc_async_sync".
	* testsuite/libgomp.oacc-c-c++-common/acc_set_cuda_stream-1.c: New file.
	* testsuite/libgomp.oacc-c-c++-common/async_queue-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-84.c: Update.
	* testsuite/libgomp.oacc-c-c++-common/lib-85.c: Likewise.

From-SVN: r267147
2018-12-14 21:42:08 +01:00
Cesar Philippidis 2049befdd0 [nvptx] Remove use of CUDA unified memory in libgomp
libgomp/
	* plugin/plugin-nvptx.c (struct cuda_map): New.
	(struct ptx_stream): Replace d, h, h_begin, h_end, h_next, h_prev,
	h_tail with (cuda_map *) map.
	(cuda_map_create): New function.
	(cuda_map_destroy): New function.
	(map_init): Update to use a linked list of cuda_map objects.
	(map_fini): Likewise.
	(map_pop): Likewise.
	(map_push): Likewise.  Return CUdeviceptr instead of void.
	(init_streams_for_device): Remove stales references to ptx_stream
	members.
	(select_stream_for_async): Likewise.
	(nvptx_exec): Update call to map_init.

From-SVN: r264397
2018-09-18 08:41:54 -07:00
Cesar Philippidis bd9b3d3d1a [nvptx] Use CUDA driver API to select default runtime launch geometry
The CUDA driver API starting version 6.5 offers a set of runtime functions to
calculate several occupancy-related measures, as a replacement for the occupancy
calculator spreadsheet.

This patch adds a heuristic for default runtime launch geometry, based on the
new runtime function cuOccupancyMaxPotentialBlockSize.

Build on x86_64 with nvptx accelerator and ran libgomp testsuite.

2018-08-13  Cesar Philippidis  <cesar@codesourcery.com>
	    Tom de Vries  <tdevries@suse.de>

	PR target/85590
	* plugin/cuda/cuda.h (CUoccupancyB2DSize): New typedef.
	(cuOccupancyMaxPotentialBlockSize): Declare.
	* plugin/cuda-lib.def (cuOccupancyMaxPotentialBlockSize): New
	CUDA_ONE_CALL_MAYBE_NULL.
	* plugin/plugin-nvptx.c (CUDA_VERSION < 6050): Define
	CUoccupancyB2DSize and declare
	cuOccupancyMaxPotentialBlockSize.
	(nvptx_exec): Use cuOccupancyMaxPotentialBlockSize to set the
	default num_gangs and num_workers when the driver supports it.

Co-Authored-By: Tom de Vries <tdevries@suse.de>

From-SVN: r263505
2018-08-13 12:04:24 +00:00
Tom de Vries 8e09a12f01 [libgomp, nvptx] Fall back to cuLinkAddData/cuLinkCreate if _v2 not found
Cuda driver api functions cuLinkAddData and cuLinkCreate are available starting
version 5.5.  In version 6.5, they are remapped onto _v2 versions.

The dlopen interface of the libgomp nvptx plugin uses the _v2 versions, so it
won't work with a cuda driver with driver api version lower than 6.5.

This patch fixes the problem by testing for the presence of the _v2 versions,
and falling back to the original versions in case of absence of the _v2
versions.

Build on x86_64 with nvptx accelerator and reg-tested libgomp, both with and
without --without-cuda-driver.

2018-08-08  Tom de Vries  <tdevries@suse.de>

	* plugin/cuda-lib.def (cuLinkAddData_v2, cuLinkCreate_v2): Declare using
	CUDA_ONE_CALL_MAYBE_NULL.
	* plugin/plugin-nvptx.c (cuLinkAddData, cuLinkCreate): Undef and declare.
	(cuLinkAddData_v2, cuLinkCreate_v2): Declare.
	(link_ptx): Fall back to cuLinkAddData/cuLinkCreate if the _v2 versions
	are not found.

From-SVN: r263408
2018-08-08 14:26:37 +00:00
Tom de Vries cedd9bd016 [libgomp, nvptx] Allow cuGetErrorString to be NULL
Cuda driver api function cuGetErrorString is available in version 6.0 and
higher.

Currently, when the driver that is used does not contain this function, the
libgomp nvptx plugin will not build (PLUGIN_NVPTX_DYNAMIC == 0) or run
(PLUGIN_NVPTX_DYNAMIC == 1).

This patch fixes this problem by testing for the presence of the function, and
handling absence.

Build on x86_64 with nvptx accelerator and reg-tested libgomp, both with and
without --without-cuda-driver.

2018-08-08  Tom de Vries  <tdevries@suse.de>

	* plugin/cuda-lib.def (cuGetErrorString): Use CUDA_ONE_CALL_MAYBE_NULL.
	* plugin/plugin-nvptx.c (cuda_error): Handle if cuGetErrorString is not
	present.

From-SVN: r263407
2018-08-08 14:26:28 +00:00
Tom de Vries b113af959c [libgomp, nvptx] Remove hard-coded const in nvptx_open_device
CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR is defined in cuda driver
api version 6.0 and higher.

Currently nvptx_open_device uses a hard-coded constant instead.

This patch fixes that by:
- defining CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR to the hardcoded
  constant at toplevel, if not present in cuda.h, and
- using CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR in nvptx_open_device

Build on x86_64 with nvptx accelerator and reg-tested libgomp.

2018-08-08  Tom de Vries  <tdevries@suse.de>

	* plugin/plugin-nvptx.c
	(CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR): Define.
	(nvptx_open_device): Use
	CU_DEVICE_ATTRIBUTE_MAX_REGISTERS_PER_MULTIPROCESSOR.

From-SVN: r263406
2018-08-08 14:26:19 +00:00
Tom de Vries 94767dacea [libgomp, nvptx] Note that cuGetErrorString is in CUDA_VERSION >= 6000
Cuda driver api function cuGetErrorString is available in version 6.0 and
higher.

This patch:
- removes a comment saying the declaration is not available in cuda.h 6.0
- fixes the presence test to use CUDA_VERSION < 6000
- moves the declaration to toplevel

Build on x86_64 with nvptx accelerator and reg-tested libgomp.

2018-08-08  Tom de Vries  <tdevries@suse.de>

	* plugin/plugin-nvptx.c (cuda_error): Move declaration of cuGetErrorString ...
	(cuGetErrorString): ... here.  Guard with CUDA_VERSION < 6000.

From-SVN: r263405
2018-08-08 14:26:10 +00:00
Tom de Vries 02150de863 [libgomp, nvptx] Handle CUDA_ONE_CALL_MAYBE_NULL
This patch adds handling of functions that may not be present in the cuda
driver.

Such a function can be declared using CUDA_ONE_CALL_MAYBE_NULL in cuda-lib.def,
it can be called with the usual convenience macros, but before calling its
presence needs to be tested using new macro CUDA_CALL_EXISTS.

When using the dlopen interface (PLUGIN_NVPTX_DYNAMIC == 1), we allow
non-present functions by allowing dlsym to return NULL.  Otherwise
(PLUGIN_NVPTX_DYNAMIC == 0) we declare the non-present function to be weak.

Build and reg-tested libgomp on x86_64 with nvidia accelerator, with and without
--disable-cuda-driver, in combination with a trigger patch that adds a
non-existing function foo to cuda-lib.def:
...
CUDA_ONE_CALL_MAYBE_NULL (foo)
...
and declares it in plugin-nvptx.c:
...
CUresult foo (void);
...
and then uses it in nvptx_init after the init_cuda_lib call:
...
  if (CUDA_CALL_EXISTS (foo))
    CUDA_CALL (foo);
...

Also build and reg-tested on x86_64 with nvidia accelerator, with and without
--disable-cuda-driver, in combination with a trigger patch that replaces all
CUDA_ONE_CALLs in cuda-lib.def with CUDA_ONE_CALL_MAYBE_NULL, and guards two
CUDA_CALLs with CUDA_CALL_EXISTS, one for a regular fn, and one for a fn that is
a define in cuda/cuda.h.

2018-08-07  Tom de Vries  <tdevries@suse.de>

	* plugin/plugin-nvptx.c (DO_PRAGMA): Define.
	(struct cuda_lib_s): Add def/undef of CUDA_ONE_CALL_MAYBE_NULL.
	(init_cuda_lib): Add new param to CUDA_ONE_CALL_1.  Add arg to
	corresponding call in CUDA_ONE_CALL.  Add def/undef of
	CUDA_ONE_CALL_MAYBE_NULL.
	(CUDA_CALL_EXISTS): Define.

From-SVN: r263346
2018-08-06 22:13:56 +00:00
Tom de Vries 9e28b10779 [libgomp, nvptx] Minimize lifetime of CUDA_ONE_CALL defines
This patch makes sure that the lifetimes of the CUDA_ONE_CALL macro (which is
defined twice in plugin-nvptx.c) are minimized, to make it obvious that the
definitions are used only in the lib-cuda.def include.

Build on x86_64 with nvptx accelerator and reg-tested libgomp.

2018-08-07  Tom de Vries  <tdevries@suse.de>

	* plugin/plugin-nvptx.c (struct cuda_lib_s, init_cuda_lib): Put
	CUDA_ONE_CALL defines right before the cuda-lib.def include, and the
	corresponding undefs right after.

From-SVN: r263345
2018-08-06 22:13:46 +00:00
Tom de Vries 099400909e [libgomp, nvptx, --without-cuda-driver] Don't use system cuda driver
Using libgomp configure option --with-cuda-driver=<dir> we can indicate what
cuda driver to use to build the libgomp nvptx plugin.  Without such an option,
the system cuda driver is used, if available.  If not availabe, a dlopen
interface is used instead.

However, when we use --without-cuda-driver (or the equivalent
--with-cuda-driver=no) the system cuda driver is still used if available.

This patch fixes that, making sure that --without-cuda-driver selects the dlopen
interface.

Build on x86_64 with nvptx accelerator and tested libgomp testsuite, with and
without option --without-cuda-driver.

2018-08-04  Tom de Vries  <tdevries@suse.de>

	* plugin/configfrag.ac: For --without-cuda-driver, set
	CUDA_DRIVER_INCLUDE and CUDA_DRIVER_LIB to no.  Handle
	CUDA_DRIVER_INCLUDE == no and CUDA_DRIVER_LIB == no.
	* configure: Regenerate.

From-SVN: r263310
2018-08-04 20:07:22 +00:00
Cesar Philippidis 094db6beb9 [PATCH] Remove use of 'struct map' from plugin (nvptx)
libgomp/
	* plugin/plugin-nvptx.c (struct map): Removed.
	(map_init, map_pop): Remove use of struct map. (map_push):
	Likewise and change argument list.
	* testsuite/libgomp.oacc-c-c++-common/mapping-1.c: New

Co-Authored-By: James Norris <jnorris@codesourcery.com>

From-SVN: r263212
2018-08-01 07:09:56 -07:00
Tom de Vries 8c6310a2c2 [libgomp, nvptx] Add cuda-lib.def
2018-08-01  Tom de Vries  <tdevries@suse.de>

	* plugin/cuda-lib.def: New file.  Factor out of ...
	* plugin/plugin-nvptx.c (CUDA_CALLS): ... here.
	(struct cuda_lib_s, init_cuda_lib): Include cuda-lib.def instead of
	using CUDA_CALLS.

From-SVN: r263208
2018-08-01 13:20:22 +00:00
Tom de Vries 4cdfee3f20 [libgomp, nvptx] Handle per-function max-threads-per-block in default dims
Currently parallel-loop-1.c fails at -O0 on a Quadro M1200, because one of the
kernel launch configurations exceeds the resources available in the device, due
to the default dimensions chosen by the runtime.

This patch fixes that by taking the per-function max_threads_per_block into
account when using the default dimensions.

2018-07-30  Tom de Vries  <tdevries@suse.de>

	* plugin/plugin-nvptx.c (MIN, MAX): Redefine.
	(nvptx_exec): Ensure worker and vector default dims don't exceed
	targ_fn->max_threads_per_block.

From-SVN: r263062
2018-07-30 08:17:26 +00:00
Tom de Vries 0b210c43bb [libgomp, nvptx] Calculate default dims per device
The default dimensions are calculated using per-device properties, but
initialized once and used on all devices.

This patch fixes this problem by introducing per-device default dimensions.

2018-07-30  Tom de Vries  <tdevries@suse.de>

	* plugin/plugin-nvptx.c (struct ptx_device): Add default_dims field.
	(nvptx_open_device): Init default_dims for device.
	(nvptx_exec): Use default_dims from device.

From-SVN: r263061
2018-07-30 08:17:16 +00:00
Cesar Philippidis 88a4654d03 [libgomp, nvptx] Add error with recompilation hint for launch failure
Currently, when a kernel is lauched with too many workers, it results in a cuda
launch failure.  This is triggered f.i. for parallel-loop-1.c at -O0 on a Quadro
M1200.

This patch detects this situation, and errors out with a hint on how to fix it.

Build and reg-tested on x86_64 with nvptx accelerator.

2018-07-26  Cesar Philippidis  <cesar@codesourcery.com>
	    Tom de Vries  <tdevries@suse.de>

	* plugin/plugin-nvptx.c (nvptx_exec): Error if the hardware doesn't have
	sufficient resources to launch a kernel, and give a hint on how to fix
	it.

Co-Authored-By: Tom de Vries <tdevries@suse.de>

From-SVN: r262997
2018-07-26 11:42:29 +00:00
Cesar Philippidis 0c6c2f5fc2 [libgomp, nvptx] Move device property sampling from nvptx_exec to nvptx_open
Move sampling of device properties from nvptx_exec to nvptx_open, and assume
the sampling always succeeds.  This simplifies the default dimension
initialization code in nvptx_open.

2018-07-26  Cesar Philippidis  <cesar@codesourcery.com>
	    Tom de Vries  <tdevries@suse.de>

	* plugin/plugin-nvptx.c (struct ptx_device): Add warp_size,
	max_threads_per_block and max_threads_per_multiprocessor fields.
	(nvptx_open_device): Initialize new fields.
	(nvptx_exec): Use num_sms, and new fields.

Co-Authored-By: Tom de Vries <tdevries@suse.de>

From-SVN: r262996
2018-07-26 11:42:19 +00:00
Tom de Vries ec00d3faf4 [openacc] Move GOMP_OPENACC_DIM parsing out of nvptx plugin
2018-05-02  Tom de Vries  <tom@codesourcery.com>

	PR libgomp/85411
	* plugin/plugin-nvptx.c (nvptx_exec): Move parsing of
	GOMP_OPENACC_DIM ...
	* env.c (parse_gomp_openacc_dim): ... here.  New function.
	(initialize_env): Call parse_gomp_openacc_dim.
	(goacc_default_dims): Define.
	* libgomp.h (goacc_default_dims): Declare.
	* oacc-plugin.c (GOMP_PLUGIN_acc_default_dim): New function.
	* oacc-plugin.h (GOMP_PLUGIN_acc_default_dim): Declare.
	* libgomp.map: New version "GOMP_PLUGIN_1.2". Add
	GOMP_PLUGIN_acc_default_dim.
	* testsuite/libgomp.oacc-c-c++-common/loop-default-runtime.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/loop-default.h: New test.

From-SVN: r259852
2018-05-02 17:53:56 +00:00
Tom de Vries df36a3d3be [nvptx, libgomp] Add GOMP_NVPTX_JIT=-O[0-4] in nvptx libgomp plugin
2018-04-26  Tom de Vries  <tom@codesourcery.com>

	PR libgomp/84020
	* plugin/cuda/cuda.h (CUjit_option): Add CU_JIT_OPTIMIZATION_LEVEL.
	* plugin/plugin-nvptx.c (_GNU_SOURCE): Define.
	(process_GOMP_NVPTX_JIT): New function.
	(link_ptx): Use process_GOMP_NVPTX_JIT.

From-SVN: r259678
2018-04-26 13:27:04 +00:00
Jakub Jelinek 85ec4feb11 Update copyright years.
From-SVN: r256169
2018-01-03 11:03:58 +01:00
Tom de Vries 12e9c8ce6c Remove semicolon after do {} while (false) in HSA_LOG
2017-10-31  Tom de Vries  <tom@codesourcery.com>

	* plugin/plugin-hsa.c (HSA_LOG): Remove semicolon after
	"do {} while (false)".
	(init_single_kernel, GOMP_OFFLOAD_async_run): Add missing semicolon
	after HSA_DEBUG call.

From-SVN: r254264
2017-10-31 12:56:41 +00:00
Tom de Vries 9607b014b2 Fix secure_getenv.h include in plugin-hsa.c
2017-07-03  Tom de Vries  <tom@codesourcery.com>

	* plugin/plugin-hsa.c: Fix secure_getenv.h include.

From-SVN: r249918
2017-07-03 13:40:19 +00:00
Tom de Vries dfb15f6bbb Show value of GOMP_OPENACC_DIM in libgomp nvptx plugin
2017-06-27  Tom de Vries  <tom@codesourcery.com>

	* plugin/plugin-nvptx.c (notify_var): New function.
	(nvptx_exec): Use notify_var for GOMP_OPENACC_DIM.

From-SVN: r249695
2017-06-27 15:51:48 +00:00
Tom de Vries 22f1a03704 Use secure_getenv for GOMP_DEBUG
2017-06-27  Tom de Vries  <tom@codesourcery.com>

	* env.c (parse_unsigned_long_1): Factor out of ...
	(parse_unsigned_long): ... here.
	(parse_int_1): Factor out of ...
	(parse_int): ... here.
	(parse_int_secure): New function.
	(initialize_env): Use parse_int_secure for GOMP_DEBUG.
	* secure_getenv.h: Factor out of ...
	* plugin/plugin-hsa.c: ... here.
	* testsuite/libgomp.oacc-c-c++-common/gomp-debug-env.c: New test.

From-SVN: r249694
2017-06-27 15:51:37 +00:00
Thomas Schwinge 78672bd8fd libgomp nvptx plugin: Debugging output when disabling nvptx offloading
libgomp/
	* plugin/plugin-nvptx.c (nvptx_get_num_devices): Debugging output
	when disabling nvptx offloading.

From-SVN: r248400
2017-05-24 08:59:05 +02:00
Thomas Schwinge 0da2f96af0 libgomp hsa plugin: debug output for HSA runtime library loading failure
libgomp/
	* plugin/plugin-hsa.c (DLSYM_FN, init_hsa_runtime_functions):
	Debug output for failure.

From-SVN: r248277
2017-05-19 15:32:04 +02:00
Jakub Jelinek 19929ba9c9 plugin-nvptx.c (cuda_lib_inited): Use signed char type instead of char.
* plugin/plugin-nvptx.c (cuda_lib_inited): Use signed char type
	instead of char.

From-SVN: r246918
2017-04-13 21:59:04 +02:00
Thomas Schwinge e70ab10d5c libgomp, nvptx plugin: Make "nvptx_exec" static
libgomp/
	* plugin/plugin-nvptx.c (nvptx_exec): Make it static.

From-SVN: r245127
2017-02-02 15:35:30 +01:00
Thomas Schwinge 345a8c1712 libgomp: Normalize the names of a few functions of the libgomp plugin API
libgomp/
	* libgomp-plugin.h (GOMP_OFFLOAD_openacc_parallel): Rename to
	GOMP_OFFLOAD_openacc_exec.  Adjust all users.
	(GOMP_OFFLOAD_openacc_get_current_cuda_device): Rename to
	GOMP_OFFLOAD_openacc_cuda_get_current_device.  Adjust all users.
	(GOMP_OFFLOAD_openacc_get_current_cuda_context): Rename to
	GOMP_OFFLOAD_openacc_cuda_get_current_context.  Adjust all users.
	(GOMP_OFFLOAD_openacc_get_cuda_stream): Rename to
	GOMP_OFFLOAD_openacc_cuda_get_stream.  Adjust all users.
	(GOMP_OFFLOAD_openacc_set_cuda_stream): Rename to
	GOMP_OFFLOAD_openacc_cuda_set_stream.  Adjust all users.

From-SVN: r245125
2017-02-02 15:13:57 +01:00
Thomas Schwinge dced339c8a libgomp: Provide prototypes for functions implemented by libgomp plugins
libgomp/
	* libgomp-plugin.h: #include <stdbool.h>.
	(GOMP_OFFLOAD_get_name, GOMP_OFFLOAD_get_caps)
	(GOMP_OFFLOAD_get_type, GOMP_OFFLOAD_get_num_devices)
	(GOMP_OFFLOAD_init_device, GOMP_OFFLOAD_fini_device)
	(GOMP_OFFLOAD_version, GOMP_OFFLOAD_load_image)
	(GOMP_OFFLOAD_unload_image, GOMP_OFFLOAD_alloc, GOMP_OFFLOAD_free)
	(GOMP_OFFLOAD_dev2host, GOMP_OFFLOAD_host2dev)
	(GOMP_OFFLOAD_dev2dev, GOMP_OFFLOAD_can_run, GOMP_OFFLOAD_run)
	(GOMP_OFFLOAD_async_run, GOMP_OFFLOAD_openacc_parallel)
	(GOMP_OFFLOAD_openacc_register_async_cleanup)
	(GOMP_OFFLOAD_openacc_async_test)
	(GOMP_OFFLOAD_openacc_async_test_all)
	(GOMP_OFFLOAD_openacc_async_wait)
	(GOMP_OFFLOAD_openacc_async_wait_async)
	(GOMP_OFFLOAD_openacc_async_wait_all)
	(GOMP_OFFLOAD_openacc_async_wait_all_async)
	(GOMP_OFFLOAD_openacc_async_set_async)
	(GOMP_OFFLOAD_openacc_create_thread_data)
	(GOMP_OFFLOAD_openacc_destroy_thread_data)
	(GOMP_OFFLOAD_openacc_get_current_cuda_device)
	(GOMP_OFFLOAD_openacc_get_current_cuda_context)
	(GOMP_OFFLOAD_openacc_get_cuda_stream)
	(GOMP_OFFLOAD_openacc_set_cuda_stream): New prototypes.
	* libgomp.h (struct acc_dispatch_t, struct gomp_device_descr): Use
	these.
	* plugin/plugin-hsa.c (GOMP_OFFLOAD_load_image)
	(GOMP_OFFLOAD_unload_image): Fix argument types.
	liboffloadmic/
	* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_get_type): Fix
	return type.
	(GOMP_OFFLOAD_load_image): Fix argument types.

From-SVN: r245062
2017-01-31 15:32:58 +01:00
Pekka Jääskeläinen 5fd1486ce5 Brig front-end
2017-01-24  Pekka Jääskeläinen <pekka@parmance.com>
	    Martin Jambor  <mjambor@suse.cz>

	* Makefile.def (target_modules): Added libhsail-rt.
	(languages): Added language brig.
	* Makefile.in: Regenerated.
	* configure.ac (TOPLEVEL_CONFIGURE_ARGUMENTS): Added
	tgarget-libhsail-rt.  Make brig unsupported on untested architectures.
	* configure: Regenerated.

gcc/
	* brig-builtins.def: New file.
	* builtins.def (DEF_HSAIL_BUILTIN): New macro.
	(DEF_HSAIL_ATOMIC_BUILTIN): Likewise.
	(DEF_HSAIL_SAT_BUILTIN): Likewise.
	(DEF_HSAIL_INTR_BUILTIN): Likewise.
	(DEF_HSAIL_CVT_ZEROI_SAT_BUILTIN): Likewise.
	* builtin-types.def (BT_INT8): New.
	(BT_INT16): Likewise.
	(BT_UINT8): Likewise.
	(BT_UINT16): Likewise.
	(BT_FN_ULONG): Likewise.
	(BT_FN_UINT_INT): Likewise.
	(BT_FN_UINT_ULONG): Likewise.
	(BT_FN_UINT_LONG): Likewise.
	(BT_FN_UINT_PTR): Likewise.
	(BT_FN_ULONG_PTR): Likewise.
	(BT_FN_INT8_FLOAT): Likewise.
	(BT_FN_INT16_FLOAT): Likewise.
	(BT_FN_UINT32_FLOAT): Likewise.
	(BT_FN_UINT16_FLOAT): Likewise.
	(BT_FN_UINT8_FLOAT): Likewise.
	(BT_FN_UINT64_FLOAT): Likewise.
	(BT_FN_UINT16_UINT32): Likewise.
	(BT_FN_UINT32_UINT16): Likewise.
	(BT_FN_UINT16_UINT16_UINT16): Likewise.
	(BT_FN_INT_PTR_INT): Likewise.
	(BT_FN_UINT_PTR_UINT): Likewise.
	(BT_FN_LONG_PTR_LONG): Likewise.
	(BT_FN_ULONG_PTR_ULONG): Likewise.
	(BT_FN_VOID_UINT64_UINT64): Likewise.
	(BT_FN_UINT8_UINT8_UINT8): Likewise.
	(BT_FN_INT8_INT8_INT8): Likewise.
	(BT_FN_INT16_INT16_INT16): Likewise.
	(BT_FN_INT_INT_INT): Likewise.
	(BT_FN_UINT_FLOAT_UINT): Likewise.
	(BT_FN_FLOAT_UINT_UINT): Likewise.
	(BT_FN_ULONG_UINT_UINT): Likewise.
	(BT_FN_ULONG_UINT_PTR): Likewise.
	(BT_FN_ULONG_ULONG_ULONG): Likewise.
	(BT_FN_UINT_UINT_UINT): Likewise.
	(BT_FN_VOID_UINT_PTR): Likewise.
	(BT_FN_UINT_UINT_PTR: Likewise.
	(BT_FN_UINT32_UINT64_PTR): Likewise.
	(BT_FN_INT_INT_UINT_UINT): Likewise.
	(BT_FN_UINT_UINT_UINT_UINT): Likewise.
	(BT_FN_UINT_UINT_UINT_PTR): Likewise.
	(BT_FN_UINT_ULONG_ULONG_UINT): Likewise.
	(BT_FN_ULONG_ULONG_ULONG_ULONG): Likewise.
	(BT_FN_LONG_LONG_UINT_UINT): Likewise.
	(BT_FN_ULONG_ULONG_UINT_UINT): Likewise.
	(BT_FN_VOID_UINT32_UINT64_PTR): Likewise.
	(BT_FN_VOID_UINT32_UINT32_PTR): Likewise.
	(BT_FN_UINT_UINT_UINT_UINT_UINT): Likewise.
	(BT_FN_UINT_FLOAT_FLOAT_FLOAT_FLOAT): Likewise.
	(BT_FN_ULONG_ULONG_ULONG_UINT_UINT): Likewise.
	* doc/frontends.texi: List BRIG FE.
	* doc/install.texi (Testing): Add BRIG tesring requirements.
	* doc/invoke.texi (Overall Options): Mention BRIG.
	* doc/standards.texi (Standards): Doucment BRIG HSA version.

gcc/brig/

	* Make-lang.in: New file.
	* brig-builtins.h: Likewise.
	* brig-c.h: Likewise.
	* brig-lang.c: Likewise.
	* brigspec.c: Likewise.
	* config-lang.in: Likewise.
	* lang-specs.h: Likewise.
	* lang.opt: Likewise.
	* brigfrontend/brig-arg-block-handler.cc: Likewise.
	* brigfrontend/brig-atomic-inst-handler.cc: Likewise.
	* brigfrontend/brig-basic-inst-handler.cc: Likewise.
	* brigfrontend/brig-branch-inst-handler.cc: Likewise.
	* brigfrontend/brig-cmp-inst-handler.cc: Likewise.
	* brigfrontend/brig-code-entry-handler.cc: Likewise.
	* brigfrontend/brig-code-entry-handler.h: Likewise.
	* brigfrontend/brig-comment-handler.cc: Likewise.
	* brigfrontend/brig-control-handler.cc: Likewise.
	* brigfrontend/brig-copy-move-inst-handler.cc: Likewise.
	* brigfrontend/brig-cvt-inst-handler.cc: Likewise.
	* brigfrontend/brig-fbarrier-handler.cc: Likewise.
	* brigfrontend/brig-function-handler.cc: Likewise.
	* brigfrontend/brig-function.cc: Likewise.
	* brigfrontend/brig-function.h: Likewise.
	* brigfrontend/brig-inst-mod-handler.cc: Likewise.
	* brigfrontend/brig-label-handler.cc: Likewise.
	* brigfrontend/brig-lane-inst-handler.cc: Likewise.
	* brigfrontend/brig-machine.c: Likewise.
	* brigfrontend/brig-machine.h: Likewise.
	* brigfrontend/brig-mem-inst-handler.cc: Likewise.
	* brigfrontend/brig-module-handler.cc: Likewise.
	* brigfrontend/brig-queue-inst-handler.cc: Likewise.
	* brigfrontend/brig-seg-inst-handler.cc: Likewise.
	* brigfrontend/brig-signal-inst-handler.cc: Likewise.
	* brigfrontend/brig-to-generic.cc: Likewise.
	* brigfrontend/brig-to-generic.h: Likewise.
	* brigfrontend/brig-util.cc: Likewise.
	* brigfrontend/brig-util.h: Likewise.
	* brigfrontend/brig-variable-handler.cc: Likewise.
	* brigfrontend/phsa.h: Likewise.


gcc/testsuite/

	* lib/brig-dg.exp: New file.
	* lib/brig.exp: Likewise.
	* brig.dg/README: Likewise.
	* brig.dg/dg.exp: Likewise.
	* brig.dg/test/gimple/alloca.hsail: Likewise.
	* brig.dg/test/gimple/atomics.hsail: Likewise.
	* brig.dg/test/gimple/branches.hsail: Likewise.
	* brig.dg/test/gimple/fbarrier.hsail: Likewise.
	* brig.dg/test/gimple/function_calls.hsail: Likewise.
	* brig.dg/test/gimple/kernarg.hsail: Likewise.
	* brig.dg/test/gimple/mem.hsail: Likewise.
	* brig.dg/test/gimple/mulhi.hsail: Likewise.
	* brig.dg/test/gimple/packed.hsail: Likewise.
	* brig.dg/test/gimple/smoke_test.hsail: Likewise.
	* brig.dg/test/gimple/variables.hsail: Likewise.
	* brig.dg/test/gimple/vector.hsail: Likewise.

include/

	* hsa.h: Moved here from libgomp/plugin/hsa.h.

libgomp/

	* plugin/hsa.h: Moved to top level include.
	* plugin/plugin-hsa.c: Chanfgd include of hsa.h accordingly.

libhsail-rt/

	* Makefile.am: New file.
	* target-config.h.in: Likewise.
	* configure.ac: Likewise.
	* configure: Likewise.
	* config.h.in: Likewise.
	* aclocal.m4: Likewise.
	* README: Likewise.
	* Makefile.in: Likewise.
	* include/internal/fibers.h: Likewise.
	* include/internal/phsa-queue-interface.h: Likewise.
	* include/internal/phsa-rt.h: Likewise.
	* include/internal/workitems.h: Likewise.
	* rt/arithmetic.c: Likewise.
	* rt/atomics.c: Likewise.
	* rt/bitstring.c: Likewise.
	* rt/fbarrier.c: Likewise.
	* rt/fibers.c: Likewise.
	* rt/fp16.c: Likewise.
	* rt/misc.c: Likewise.
	* rt/multimedia.c: Likewise.
	* rt/queue.c: Likewise.
	* rt/sat_arithmetic.c: Likewise.
	* rt/segment.c: Likewise.
	* rt/workitems.c: Likewise.


Co-Authored-By: Martin Jambor <mjambor@suse.cz>

From-SVN: r244867
2017-01-24 13:45:56 +01:00
Jakub Jelinek b32e85fa42 cuda.h (CUdeviceptr): Typedef to unsigned long long even for _WIN64.
* plugin/cuda/cuda.h (CUdeviceptr): Typedef to unsigned long long even
	for _WIN64.

From-SVN: r244638
2017-01-19 16:53:51 +01:00
Jakub Jelinek d190d5c093 hsa.h: Add GCC runtime library exception.
* plugin/hsa.h: Add GCC runtime library exception.
	* plugin/hsa_ext_finalize.h: Likewise.

From-SVN: r244523
2017-01-17 10:45:23 +01:00
Jakub Jelinek 2393d337e7 configfrag.ac: For --without-cuda-driver don't initialize CUDA_DRIVER_INCLUDE nor CUDA_DRIVER_LIB.
* plugin/configfrag.ac: For --without-cuda-driver don't initialize
	CUDA_DRIVER_INCLUDE nor CUDA_DRIVER_LIB.  If both
	CUDA_DRIVER_INCLUDE and CUDA_DRIVER_LIB are empty and linking small
	cuda program fails, define PLUGIN_NVPTX_DYNAMIC to 1 and use
	plugin/include/cuda as include dir and -ldl instead of -lcuda as
	library to link ptx plugin against.
	* plugin/plugin-nvptx.c: Include dlfcn.h if PLUGIN_NVPTX_DYNAMIC.
	(CUDA_CALLS): Define.
	(cuda_lib, cuda_lib_inited): New variables.
	(init_cuda_lib): New function.
	(CUDA_CALL_PREFIX): Define.
	(CUDA_CALL_ERET, CUDA_CALL_ASSERT): Use CUDA_CALL_PREFIX.
	(CUDA_CALL): Use FN instead of (FN).
	(CUDA_CALL_NOCHECK): Define.
	(cuda_error, fini_streams_for_device, select_stream_for_async,
	nvptx_attach_host_thread_to_device, nvptx_open_device, link_ptx,
	event_gc, nvptx_exec, nvptx_async_test, nvptx_async_test_all,
	nvptx_wait_all, nvptx_set_clocktick, GOMP_OFFLOAD_unload_image,
	nvptx_stacks_alloc, nvptx_stacks_free, GOMP_OFFLOAD_run): Use
	CUDA_CALL_NOCHECK.
	(nvptx_init): Call init_cuda_lib, if it fails, return false.  Use
	CUDA_CALL_NOCHECK.
	(nvptx_get_num_devices): Call init_cuda_lib, if it fails, return 0.
	Use CUDA_CALL_NOCHECK.
	* plugin/cuda/cuda.h: New file.
	* config.h.in: Regenerated.
	* configure: Regenerated.

From-SVN: r244522
2017-01-17 10:44:17 +01:00
Jakub Jelinek cbe34bb5ed Update copyright years.
From-SVN: r243994
2017-01-01 13:07:43 +01:00
Alexander Monakov 6103184e81 OpenMP offloading to NVPTX: libgomp changes
* Makefile.am (libgomp_la_SOURCES): Add atomic.c, icv.c, icv-device.c.
	* Makefile.in. Regenerate.
	* configure.ac [nvptx*-*-*] (libgomp_use_pthreads): Set and use it...
	(LIBGOMP_USE_PTHREADS): ...here; new define.
	* configure: Regenerate.
	* config.h.in: Likewise.
	* config/posix/affinity.c: Move to...
	* affinity.c: ...here (new file).  Guard use of Pthreads-specific
	interface by LIBGOMP_USE_PTHREADS. 
	* critical.c: Split out GOMP_atomic_{start,end} into...
	* atomic.c: ...here (new file).
	* env.c: Split out ICV definitions into...
	* icv.c: ...here (new file) and...
	* icv-device.c: ...here. New file.
	* config/linux/lock.c (gomp_init_lock_30): Move to generic lock.c.
	(gomp_destroy_lock_30): Ditto.
	(gomp_set_lock_30): Ditto.
	(gomp_unset_lock_30): Ditto.
	(gomp_test_lock_30): Ditto.
	(gomp_init_nest_lock_30): Ditto.
	(gomp_destroy_nest_lock_30): Ditto.
	(gomp_set_nest_lock_30): Ditto.
	(gomp_unset_nest_lock_30): Ditto.
	(gomp_test_nest_lock_30): Ditto.
	* lock.c: New.
	* config/nvptx/lock.c: New.
	* config/nvptx/bar.c: New.
	* config/nvptx/bar.h: New.
	* config/nvptx/doacross.h: New.
	* config/nvptx/error.c: New.
	* config/nvptx/icv-device.c: New.
	* config/nvptx/mutex.h: New.
	* config/nvptx/pool.h: New.
	* config/nvptx/proc.c: New.
	* config/nvptx/ptrlock.h: New.
	* config/nvptx/sem.h: New.
	* config/nvptx/simple-bar.h: New.
	* config/nvptx/target.c: New.
	* config/nvptx/task.c: New.
	* config/nvptx/team.c: New.
	* config/nvptx/time.c: New.
	* config/posix/simple-bar.h: New.
	* libgomp.h: Guard pthread.h inclusion.  Include simple-bar.h.
	(gomp_num_teams_var): Declare.
	(struct gomp_thread_pool): Change threads_dock member to
	gomp_simple_barrier_t.
	[__nvptx__] (gomp_thread): New implementation.
	(gomp_thread_attr): Guard by LIBGOMP_USE_PTHREADS.
	(gomp_thread_destructor): Ditto.
	(gomp_init_thread_affinity): Ditto.
	* team.c: Guard uses of Pthreads-specific interfaces by
	LIBGOMP_USE_PTHREADS.  Adjust all uses of threads_dock.
	(gomp_free_thread) [__nvptx__]: Do not call 'free'.

	* config/nvptx/alloc.c: Delete.
	* config/nvptx/barrier.c: Ditto.
	* config/nvptx/fortran.c: Ditto.
	* config/nvptx/iter.c: Ditto.
	* config/nvptx/iter_ull.c: Ditto.
	* config/nvptx/loop.c: Ditto.
	* config/nvptx/loop_ull.c: Ditto.
	* config/nvptx/ordered.c: Ditto.
	* config/nvptx/parallel.c: Ditto.
	* config/nvptx/priority_queue.c: Ditto.
	* config/nvptx/sections.c: Ditto.
	* config/nvptx/single.c: Ditto.
	* config/nvptx/splay-tree.c: Ditto.
	* config/nvptx/work.c: Ditto.

	* testsuite/libgomp.fortran/fortran.exp (lang_link_flags): Pass
	-foffload=-lgfortran in addition to -lgfortran.
	* testsuite/libgomp.oacc-fortran/fortran.exp (lang_link_flags): Ditto.

	* plugin/plugin-nvptx.c: Include <limits.h>.
	(struct targ_fn_descriptor): Add new fields.
	(struct ptx_device): Ditto.  Set them...
	(nvptx_open_device): ...here.
	(nvptx_adjust_launch_bounds): New.
	(nvptx_host2dev): Allow NULL 'nvthd'.
	(nvptx_dev2host): Ditto.
	(GOMP_OFFLOAD_get_caps): Add GOMP_OFFLOAD_CAP_OPENMP_400.
	(link_ptx): Adjust log sizes.
	(nvptx_host2dev): Allow NULL 'nvthd'.
	(nvptx_dev2host): Ditto.
	(nvptx_set_clocktick): New.  Use it...
	(GOMP_OFFLOAD_load_image): ...here.  Set new targ_fn_descriptor
	fields.
	(GOMP_OFFLOAD_dev2dev): New.
	(nvptx_adjust_launch_bounds): New.
	(nvptx_stacks_size): New.
	(nvptx_stacks_alloc): New.
	(nvptx_stacks_free): New.
	(GOMP_OFFLOAD_run): New.
	(GOMP_OFFLOAD_async_run): New (stub).

Co-Authored-By: Dmitry Melnik <dm@ispras.ru>
Co-Authored-By: Jakub Jelinek <jakub@redhat.com>

From-SVN: r242789
2016-11-23 21:36:41 +03:00
Martin Liska b8d89b03db Remove build dependence on HSA run-time
2016-11-23  Martin Liska  <mliska@suse.cz>
            Martin Jambor  <mjambor@suse.cz>

gcc/
	* doc/install.texi: Remove entry about --with-hsa-kmt-lib.

libgomp/
	* plugin/hsa.h: New file.
	* plugin/hsa_ext_finalize.h: New file.
	* plugin/configfrag.ac: Remove hsa-kmt-lib test.  Added checks for
	header file unistd.h, and functions secure_getenv, __secure_getenv,
	getuid, geteuid, getgid and getegid.
	* plugin/Makefrag.am (libgomp_plugin_hsa_la_CPPFLAGS): Added
	-D_GNU_SOURCE.
	* plugin/plugin-hsa.c: Include config.h, inttypes.h and stdbool.h.
	Handle various cases of secure_getenv presence, add an implementation
	when we can test effective UID and GID.
	(struct hsa_runtime_fn_info): New structure.
	(hsa_runtime_fn_info hsa_fns): New variable.
	(hsa_runtime_lib): Likewise.
	(support_cpu_devices): Likewise.
	(init_enviroment_variables): Load newly introduced ENV
	variables.
	(hsa_warn): Call hsa run-time functions via hsa_fns structure.
	(hsa_fatal): Likewise.
	(DLSYM_FN): New macro.
	(init_hsa_runtime_functions): New function.
	(suitable_hsa_agent_p): Call hsa run-time functions via hsa_fns
	structure.  Depending on environment, also allow CPU devices.
	(init_hsa_context): Call hsa run-time functions via hsa_fns structure.
	(get_kernarg_memory_region): Likewise.
	(GOMP_OFFLOAD_init_device): Likewise.
	(destroy_hsa_program): Likewise.
	(init_basic_kernel_info): New function.
	(GOMP_OFFLOAD_load_image): Use it.
	(create_and_finalize_hsa_program): Call hsa run-time functions via
	hsa_fns structure.
	(create_single_kernel_dispatch): Likewise.
	(release_kernel_dispatch): Likewise.
	(init_single_kernel): Likewise.
	(parse_target_attributes): Allow up multiple HSA grid dimensions.
	(get_group_size): New function.
	(run_kernel): Likewise.
	(GOMP_OFFLOAD_run): Outline most functionality to run_kernel.
	(GOMP_OFFLOAD_fini_device): Call hsa run-time functions via hsa_fns
	structure.
	* testsuite/lib/libgomp.exp: Remove hsa_kmt_lib support.
	* testsuite/libgomp-test-support.exp.in: Likewise.
	* Makefile.in: Regenerated.
	* aclocal.m4: Likewise.
	* config.h.in: Likewise.
	* configure: Likewise.
	* testsuite/Makefile.in: Likewise.



Co-Authored-By: Martin Jambor <mjambor@suse.cz>

From-SVN: r242749
2016-11-23 13:27:13 +01:00
Cesar Philippidis 6668eb4593 nvptx.c (PTX_GANG_DEFAULT): Set to zero.
gcc/
	* config/nvptx/nvptx.c (PTX_GANG_DEFAULT): Set to zero.

	libgomp/
	* plugin/plugin-nvptx.c (nvptx_exec): Interrogate board attributes
	to determine default geometry.
	* testsuite/libgomp.oacc-c-c++-common/loop-auto-1.c: Set gang
	dimension.


Co-Authored-By: Nathan Sidwell <nathan@acm.org>

From-SVN: r241803
2016-11-02 15:10:02 -07:00
Chung-Lin Tang b4557008c4 oacc-plugin.h (GOMP_PLUGIN_async_unmap_vars): Add int parameter.
2016-05-26  Chung-Lin Tang  <cltang@codesourcery.com>

	libgomp/
	* oacc-plugin.h (GOMP_PLUGIN_async_unmap_vars): Add int parameter.
	* oacc-plugin.c (GOMP_PLUGIN_async_unmap_vars): Add 'int async'
	parameter, use to set async stream around call to gomp_unmap_vars,
	call gomp_unmap_vars() with 'do_copyfrom' set to true.
	* plugin/plugin-nvptx.c (struct ptx_event): Add 'int val' field.
	(event_gc): Adjust event handling loop, collect PTX_EVT_ASYNC_CLEANUP
	events and call GOMP_PLUGIN_async_unmap_vars() for each of them.
	(event_add): Add int parameter, initialize 'val' field when
	adding new ptx_event struct.
	(nvptx_evec): Adjust event_add() call arguments.
	(nvptx_host2dev): Likewise.
	(nvptx_dev2host): Likewise.
	(nvptx_wait_async): Likewise.
	(nvptx_wait_all_async): Likewise.
	(GOMP_OFFLOAD_openacc_register_async_cleanup): Add async parameter,
	pass to event_add() call.
	* oacc-host.c (host_openacc_register_async_cleanup): Add 'int async'
	parameter.
	* oacc-mem.c (gomp_acc_remove_pointer): Adjust async case to
	call openacc.register_async_cleanup_func() hook.
	* oacc-parallel.c (GOACC_parallel_keyed): Likewise.
	* target.c (gomp_copy_from_async): Delete function.
	(gomp_map_vars): Remove async_refcount.
	(gomp_unmap_vars): Likewise.
	(gomp_load_image_to_device): Likewise.
	(omp_target_associate_ptr): Likewise.
	* libgomp.h (struct splay_tree_key_s): Remove async_refcount.
	(acc_dispatch_t.register_async_cleanup_func): Add int parameter.
	(gomp_copy_from_async): Remove.

From-SVN: r236772
2016-05-26 13:28:25 +00:00
Chung-Lin Tang 6ce1307231 target.c (gomp_device_copy): New function.
libgomp/
2016-05-26  Chung-Lin Tang  <cltang@codesourcery.com>

	* target.c (gomp_device_copy): New function.
	(gomp_copy_host2dev): Likewise.
	(gomp_copy_dev2host): Likewise.
	(gomp_free_device_memory): Likewise.
	(gomp_map_vars_existing): Adjust to call gomp_copy_host2dev.
	(gomp_map_pointer): Likewise.
	(gomp_map_vars): Adjust to call gomp_copy_host2dev, handle
	NULL value from alloc_func plugin hook.
	(gomp_unmap_tgt): Adjust to call gomp_free_device_memory.
	(gomp_copy_from_async): Adjust to call gomp_copy_dev2host.
	(gomp_unmap_vars): Likewise.
	(gomp_update): Adjust to call gomp_copy_dev2host and
	gomp_copy_host2dev functions.
	(gomp_unload_image_from_device): Handle false value from
	unload_image_func plugin hook.
	(gomp_init_device): Handle false value from init_device_func
	plugin hook.
	(gomp_exit_data): Adjust to call gomp_copy_dev2host.
	(omp_target_free): Adjust to call gomp_free_device_memory.
	(omp_target_memcpy): Handle return values from host2dev_func,
	dev2host_func, and dev2dev_func plugin hooks.
	(omp_target_memcpy_rect_worker): Likewise.
	(gomp_target_fini): Handle false value from fini_device_func
	plugin hook.
	* libgomp.h (struct gomp_device_descr): Adjust return type of
	init_device_func, fini_device_func, unload_image_func, free_func,
	dev2host_func,host2dev_func, and dev2dev_func plugin hooks to 'bool'.
	* oacc-init.c (acc_shutdown_1): Handle false value from
	fini_device_func plugin hook.
	* oacc-host.c (host_init_device): Change return type to bool.
	(host_fini_device): Likewise.
	(host_unload_image): Likewise.
	(host_free): Likewise.
	(host_dev2host): Likewise.
	(host_host2dev): Likewise.
	* oacc-mem.c (acc_free): Handle plugin hook fatal error case.
	(acc_memcpy_to_device): Likewise.
	(acc_memcpy_from_device): Likewise.
	(delete_copyout): Add libfnname parameter, handle free_func
	hook fatal error case.
	(acc_delete): Adjust delete_copyout call.
	(acc_copyout): Likewise.
	(update_dev_host): Move gomp_mutex_unlock to after
	host2dev/dev2host hook calls.

	* plugin/plugin-hsa.c (hsa_warn): Adjust 'hsa_error' local variable
	to 'hsa_error_msg', for clarity.
	(hsa_fatal): Likewise.
	(hsa_error): New function.
	(init_hsa_context): Change return type to bool, adjust to return
	false on error.
	(GOMP_OFFLOAD_get_num_devices): Adjust to handle init_hsa_context
	return value.
	(GOMP_OFFLOAD_init_device): Change return type to bool, adjust to
	return false on error.
	(get_agent_info): Adjust to return NULL on error.
	(destroy_hsa_program): Change return type to bool, adjust to
	return false on error.
	(GOMP_OFFLOAD_load_image): Adjust to return -1 on error.
	(destroy_module): Change return type to bool, adjust to
	return false on error.
	(GOMP_OFFLOAD_unload_image): Likewise.
	(GOMP_OFFLOAD_fini_device): Likewise.
	(GOMP_OFFLOAD_alloc): Change to return NULL when called.
	(GOMP_OFFLOAD_free): Change to return false when called.
	(GOMP_OFFLOAD_dev2host): Likewise.
	(GOMP_OFFLOAD_host2dev): Likewise.
	(GOMP_OFFLOAD_dev2dev): Likewise.

	* plugin/plugin-nvptx.c (CUDA_CALL_ERET): New convenience macro.
	(CUDA_CALL): Likewise.
	(CUDA_CALL_ASSERT): Likewise.
	(map_init): Change return type to bool, use CUDA_CALL* macros.
	(map_fini): Likewise.
	(init_streams_for_device): Change return type to bool, adjust
	call to map_init.
	(fini_streams_for_device): Change return type to bool, adjust
	call to map_fini.
	(select_stream_for_async): Release stream_lock before calls to
	GOMP_PLUGIN_fatal, adjust call to map_init.
	(nvptx_init): Use CUDA_CALL* macros.
	(nvptx_attach_host_thread_to_device): Change return type to bool,
	use CUDA_CALL* macros.
	(nvptx_open_device): Use CUDA_CALL* macros.
	(nvptx_close_device): Change return type to bool, use CUDA_CALL*
	macros.
	(nvptx_get_num_devices): Use CUDA_CALL* macros.
	(link_ptx): Change return type to bool, use CUDA_CALL* macros.
	(nvptx_exec): Use CUDA_CALL* macros.
	(nvptx_alloc): Use CUDA_CALL* macros.
	(nvptx_free): Change return type to bool, use CUDA_CALL* macros.
	(nvptx_host2dev): Likewise.
	(nvptx_dev2host): Likewise.
	(nvptx_wait): Use CUDA_CALL* macros.
	(nvptx_wait_async): Likewise.
	(nvptx_wait_all): Likewise.
	(nvptx_wait_all_async): Likewise.
	(nvptx_set_cuda_stream): Adjust order of stream_lock acquire,
	use CUDA_CALL* macros, adjust call to map_fini.
	(GOMP_OFFLOAD_init_device): Change return type to bool,
	adjust code accordingly.
	(GOMP_OFFLOAD_fini_device): Likewise.
	(GOMP_OFFLOAD_load_image): Adjust calls to
	nvptx_attach_host_thread_to_device/link_ptx to handle errors,
	use CUDA_CALL* macros.
	(GOMP_OFFLOAD_unload_image): Change return type to bool, adjust
	return code.
	(GOMP_OFFLOAD_alloc): Adjust calls to code to handle error return.
	(GOMP_OFFLOAD_free): Change return type to bool, adjust calls to
	handle error return.
	(GOMP_OFFLOAD_dev2host): Likewise.
	(GOMP_OFFLOAD_host2dev): Likewise.
	(GOMP_OFFLOAD_openacc_register_async_cleanup): Use CUDA_CALL* macros.
	(GOMP_OFFLOAD_openacc_create_thread_data): Likewise.

liboffloadmic/
2016-05-26  Chung-Lin Tang  <cltang@codesourcery.com>

	* plugin/libgomp-plugin-intelmic.cpp (offload): Change return type
	to bool, adjust return code.
	(GOMP_OFFLOAD_init_device): Likewise.
	(GOMP_OFFLOAD_fini_device): Likewise.
	(get_target_table): Likewise.
	(offload_image): Likwise.
	(GOMP_OFFLOAD_load_image): Adjust call to offload_image(), change
	to return -1 on error.
	(GOMP_OFFLOAD_unload_image): Change return type to bool, adjust return
	code.
	(GOMP_OFFLOAD_alloc): Likewise.
	(GOMP_OFFLOAD_free): Likewise.
	(GOMP_OFFLOAD_host2dev): Likewise.
	(GOMP_OFFLOAD_dev2host): Likewise.
	(GOMP_OFFLOAD_dev2dev): Likewise.

From-SVN: r236768
2016-05-26 09:58:56 +00:00
Alexander Monakov c2bd3b6911 libgomp nvptx plugin: make cuMemFreeHost error non-fatal
From-SVN: r235339
2016-04-21 16:11:47 +03:00
Martin Liska f9c8babbab Properly assign to packet header (PR hsa/70394)
* plugin/plugin-hsa.c (packet_store_release): New function
	that is taken from the HSA runtime manual.
	(GOMP_OFFLOAD_run): Use the function.

From-SVN: r234454
2016-03-24 13:04:12 +00:00
Martin Liska 7397fce2f7 Copy shadow argument conditionally (PR hsa/70337)
PR hsa/70337
	* plugin/plugin-hsa.c (GOMP_OFFLOAD_run): Copy shadow
	argument just in case a dispatched kernel uses that argument.

From-SVN: r234418
2016-03-23 09:59:51 +00:00
Thomas Schwinge f99c355797 Use plain -fopenacc to enable OpenACC kernels processing
gcc/
	* tree-parloops.c (create_parallel_loop, gen_parallel_loop)
	(parallelize_loops): In OpenACC kernels mode, set n_threads to
	zero.
	(pass_parallelize_loops::gate): In OpenACC kernels mode, gate on
	flag_openacc.
	* tree-ssa-loop.c (gate_oacc_kernels): Likewise.
	gcc/testsuite/
	* c-c++-common/goacc/kernels-counter-vars-function-scope.c: Adjust
	to -ftree-parallelize-loops/-fopenacc changes.
	* c-c++-common/goacc/kernels-double-reduction-n.c: Likewise.
	* c-c++-common/goacc/kernels-double-reduction.c: Likewise.
	* c-c++-common/goacc/kernels-loop-2.c: Likewise.
	* c-c++-common/goacc/kernels-loop-3.c: Likewise.
	* c-c++-common/goacc/kernels-loop-g.c: Likewise.
	* c-c++-common/goacc/kernels-loop-mod-not-zero.c: Likewise.
	* c-c++-common/goacc/kernels-loop-n.c: Likewise.
	* c-c++-common/goacc/kernels-loop-nest.c: Likewise.
	* c-c++-common/goacc/kernels-loop.c: Likewise.
	* c-c++-common/goacc/kernels-one-counter-var.c: Likewise.
	* c-c++-common/goacc/kernels-reduction.c: Likewise.
	* gfortran.dg/goacc/kernels-loop-inner.f95: Likewise.
	* gfortran.dg/goacc/kernels-loops-adjacent.f95: Likewise.
	libgomp/
	* oacc-parallel.c (GOACC_parallel_keyed): Initialize dims.
	* plugin/plugin-nvptx.c (nvptx_exec): Provide default values for
	dims.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c: Adjust to
	-ftree-parallelize-loops/-fopenacc changes.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-loop.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c:
	Likewise.

From-SVN: r233634
2016-02-23 16:07:54 +01:00
Thomas Schwinge 033ff3d130 libgomp: Use HSA_RUNTIME_LIB, HSA_KMT_LIB in the testsuite
libgomp/
	* plugin/configfrag.ac (HSA_KMT_LIB, HSA_KMT_LDFLAGS): New
	variables.
	* testsuite/libgomp-test-support.exp.in (hsa_runtime_lib)
	(hsa_kmt_lib): Set variables.
	* testsuite/lib/libgomp.exp (libgomp_init): Use them to amend
	always_ld_library_path.
	* Makefile.in: Regenerate.
	* configure: Likewise.
	* testsuite/Makefile.in: Likewise.

From-SVN: r233072
2016-02-02 13:48:31 +01:00
Thomas Schwinge 4a88d9b77a libgomp: For hsa offloading, compilation is all handled by the target compiler
libgomp/
	* plugin/configfrag.ac (offload_additional_options)
	(offload_additional_lib_paths): Don't amend for hsa offloading.
	* configure: Regenerate.

From-SVN: r233071
2016-02-02 13:48:21 +01:00
Thomas Schwinge 41d809d3c8 libgomp: Don't configure for offloading target if we don't build the corresponding plugin
libgomp/
	* plugin/configfrag.ac: Don't configure for offloading target if
	we don't build the corresponding plugin.
	* configure: Regenerate.

From-SVN: r233070
2016-02-02 13:48:04 +01:00
Martin Jambor b2b4005150 Merge of HSA
2016-01-19  Martin Jambor  <mjambor@suse.cz>
	    Martin Liska  <mliska@suse.cz>
	    Michael Matz <matz@suse.de>

libgomp/
	* plugin/Makefrag.am: Add HSA plugin requirements.
	* plugin/configfrag.ac (HSA_RUNTIME_INCLUDE): New variable.
	(HSA_RUNTIME_LIB): Likewise.
	(HSA_RUNTIME_CPPFLAGS): Likewise.
	(HSA_RUNTIME_INCLUDE): New substitution.
	(HSA_RUNTIME_LIB): Likewise.
	(HSA_RUNTIME_LDFLAGS): Likewise.
	(hsa-runtime): New configure option.
	(hsa-runtime-include): Likewise.
	(hsa-runtime-lib): Likewise.
	(PLUGIN_HSA): New substitution variable.
	Fill HSA_RUNTIME_INCLUDE and HSA_RUNTIME_LIB according to the new
	configure options.
	(PLUGIN_HSA_CPPFLAGS): Likewise.
	(PLUGIN_HSA_LDFLAGS): Likewise.
	(PLUGIN_HSA_LIBS): Likewise.
	Check that we have access to HSA run-time.
	* libgomp-plugin.h (offload_target_type): New element
	OFFLOAD_TARGET_TYPE_HSA.
	* libgomp.h (gomp_target_task): New fields firstprivate_copies and
	args.
	(bool gomp_create_target_task): Updated.
	(gomp_device_descr): Extra parameter of run_func and async_run_func,
	new field can_run_func.
	* libgomp_g.h (GOMP_target_ext): Update prototype.
	* oacc-host.c (host_run): Added a new parameter args.
	* target.c (calculate_firstprivate_requirements): New function.
	(copy_firstprivate_data): Likewise.
	(gomp_target_fallback_firstprivate): Use them.
	(gomp_target_unshare_firstprivate): New function.
	(gomp_get_target_fn_addr): Allow returning NULL for shared memory
	devices.
	(GOMP_target): Do host fallback for all shared memory devices.  Do not
	pass any args to plugins.
	(GOMP_target_ext): Introduce device-specific argument parameter args.
	Allow host fallback if device shares memory.  Do not remap data if
	device has shared memory.
	(gomp_target_task_fn): Likewise.  Also treat shared memory devices
	like host fallback for mappings.
	(GOMP_target_data): Treat shared memory devices like host fallback.
	(GOMP_target_data_ext): Likewise.
	(GOMP_target_update): Likewise.
	(GOMP_target_update_ext): Likewise.  Also pass NULL as args to
	gomp_create_target_task.
	(GOMP_target_enter_exit_data): Likewise.
	(omp_target_alloc): Treat shared memory devices like host fallback.
	(omp_target_free): Likewise.
	(omp_target_is_present): Likewise.
	(omp_target_memcpy): Likewise.
	(omp_target_memcpy_rect): Likewise.
	(omp_target_associate_ptr): Likewise.
	(gomp_load_plugin_for_device): Also load can_run.
	* task.c (GOMP_PLUGIN_target_task_completion): Free
	firstprivate_copies.
	(gomp_create_target_task): Accept new argument args and store it to
	ttask.
	* plugin/plugin-hsa.c: New file.

gcc/
	* Makefile.in (OBJS): Add new source files.
	(GTFILES): Add hsa.c.
	* common.opt (disable_hsa): New variable.
	(-Whsa): New warning.
	* config.in (ENABLE_HSA): New.
	* configure.ac: Treat hsa differently from other accelerators.
	(OFFLOAD_TARGETS): Define ENABLE_OFFLOADING according to
	$enable_offloading.
	(ENABLE_HSA): Define ENABLE_HSA according to $enable_hsa.
	* doc/install.texi (Configuration): Document --with-hsa-runtime,
	--with-hsa-runtime-include, --with-hsa-runtime-lib and
	--with-hsa-kmt-lib.
	* doc/invoke.texi (-Whsa): Document.
	(hsa-gen-debug-stores): Likewise.
	* lto-wrapper.c (compile_images_for_offload_targets): Do not attempt
	to invoke offload compiler for hsa acclerator.
	* opts.c (common_handle_option): Determine whether HSA offloading
	should be performed.
	* params.def (PARAM_HSA_GEN_DEBUG_STORES): New parameter.
	* builtin-types.def (BT_FN_VOID_UINT_PTR_INT_PTR): New.
	(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_INT_INT): Removed.
	(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR): New.
	* gimple-low.c (lower_stmt): Also handle GIMPLE_OMP_GRID_BODY.
	* gimple-pretty-print.c (dump_gimple_omp_for): Also handle
	GF_OMP_FOR_KIND_GRID_LOOP.
	(dump_gimple_omp_block): Also handle GIMPLE_OMP_GRID_BODY.
	(pp_gimple_stmt_1): Likewise.
	* gimple-walk.c (walk_gimple_stmt): Likewise.
	* gimple.c (gimple_build_omp_grid_body): New function.
	(gimple_copy): Also handle GIMPLE_OMP_GRID_BODY.
	* gimple.def (GIMPLE_OMP_GRID_BODY): New.
	* gimple.h (enum gf_mask): Added GF_OMP_PARALLEL_GRID_PHONY,
	GF_OMP_FOR_KIND_GRID_LOOP, GF_OMP_FOR_GRID_PHONY and
	GF_OMP_TEAMS_GRID_PHONY.
	(gimple_statement_omp_single_layout): Updated comments.
	(gimple_build_omp_grid_body): New function.
	(gimple_has_substatements): Also handle GIMPLE_OMP_GRID_BODY.
	(gimple_omp_for_grid_phony): New function.
	(gimple_omp_for_set_grid_phony): Likewise.
	(gimple_omp_parallel_grid_phony): Likewise.
	(gimple_omp_parallel_set_grid_phony): Likewise.
	(gimple_omp_teams_grid_phony): Likewise.
	(gimple_omp_teams_set_grid_phony): Likewise.
	(gimple_return_set_retbnd): Also handle GIMPLE_OMP_GRID_BODY.
	* omp-builtins.def (BUILT_IN_GOMP_OFFLOAD_REGISTER): New.
	(BUILT_IN_GOMP_OFFLOAD_UNREGISTER): Likewise.
	(BUILT_IN_GOMP_TARGET): Updated type.
	* omp-low.c: Include symbol-summary.h, hsa.h and params.h.
	(adjust_for_condition): New function.
	(get_omp_for_step_from_incr): Likewise.
	(extract_omp_for_data): Moved parts to adjust_for_condition and
	get_omp_for_step_from_incr.
	(build_outer_var_ref): Handle GIMPLE_OMP_GRID_BODY.
	(fixup_child_record_type): Bail out if receiver_decl is NULL.
	(scan_sharing_clauses): Handle OMP_CLAUSE__GRIDDIM_.
	(scan_omp_parallel): Do not create child functions for phony
	constructs.
	(check_omp_nesting_restrictions): Handle GIMPLE_OMP_GRID_BODY.
	(scan_omp_1_op): Checking assert we are not remapping to
	ERROR_MARK.  Also also handle GIMPLE_OMP_GRID_BODY.
	(parallel_needs_hsa_kernel_p): New function.
	(expand_parallel_call): Register apprpriate parallel child
	functions as HSA kernels.
	(grid_launch_attributes_trees): New type.
	(grid_attr_trees): New variable.
	(grid_create_kernel_launch_attr_types): New function.
	(grid_insert_store_range_dim): Likewise.
	(grid_get_kernel_launch_attributes): Likewise.
	(get_target_argument_identifier_1): Likewise.
	(get_target_argument_identifier): Likewise.
	(get_target_argument_value): Likewise.
	(push_target_argument_according_to_value): Likewise.
	(get_target_arguments): Likewise.
	(expand_omp_target): Call get_target_arguments instead of looking
	up for teams and thread limit.
	(grid_expand_omp_for_loop): New function.
	(grid_arg_decl_map): New type.
	(grid_remap_kernel_arg_accesses): New function.
	(grid_expand_target_kernel_body): New function.
	(expand_omp): Call it.
	(lower_omp_for): Do not emit phony constructs.
	(lower_omp_taskreg): Do not emit phony constructs but create for them
	a temporary variable receiver_decl.
	(lower_omp_taskreg): Do not emit phony constructs.
	(lower_omp_teams): Likewise.
	(lower_omp_grid_body): New function.
	(lower_omp_1): Call it.
	(grid_reg_assignment_to_local_var_p): New function.
	(grid_seq_only_contains_local_assignments): Likewise.
	(grid_find_single_omp_among_assignments_1): Likewise.
	(grid_find_single_omp_among_assignments): Likewise.
	(grid_find_ungridifiable_statement): Likewise.
	(grid_target_follows_gridifiable_pattern): Likewise.
	(grid_remap_prebody_decls): Likewise.
	(grid_copy_leading_local_assignments): Likewise.
	(grid_process_kernel_body_copy): Likewise.
	(grid_attempt_target_gridification): Likewise.
	(grid_gridify_all_targets_stmt): Likewise.
	(grid_gridify_all_targets): Likewise.
	(execute_lower_omp): Call grid_gridify_all_targets.
	(make_gimple_omp_edges): Handle GIMPLE_OMP_GRID_BODY.
	* tree-core.h (omp_clause_code): Added OMP_CLAUSE__GRIDDIM_.
	(tree_omp_clause): Added union field dimension.
	* tree-pretty-print.c (dump_omp_clause): Handle OMP_CLAUSE__GRIDDIM_.
	* tree.c (omp_clause_num_ops): Added number of arguments of
	OMP_CLAUSE__GRIDDIM_.
	(omp_clause_code_name): Added name of OMP_CLAUSE__GRIDDIM_.
	(walk_tree_1): Handle OMP_CLAUSE__GRIDDIM_.
	* tree.h (OMP_CLAUSE_GRIDDIM_DIMENSION): New.
	(OMP_CLAUSE_SET_GRIDDIM_DIMENSION): Likewise.
	(OMP_CLAUSE_GRIDDIM_SIZE): Likewise.
	(OMP_CLAUSE_GRIDDIM_GROUP): Likewise.
	* passes.def: Schedule pass_ipa_hsa and pass_gen_hsail.
	* tree-pass.h (make_pass_gen_hsail): Declare.
	(make_pass_ipa_hsa): Likewise.
	* ipa-hsa.c: New file.
	* lto-section-in.c (lto_section_name): Add hsa section name.
	* lto-streamer.h (lto_section_type): Add hsa section.
	* timevar.def (TV_IPA_HSA): New.
        * hsa-brig-format.h: New file.
	* hsa-brig.c: New file.
	* hsa-dump.c: Likewise.
	* hsa-gen.c: Likewise.
	* hsa.c: Likewise.
	* hsa.h: Likewise.
	* toplev.c (compile_file): Call hsa_output_brig.
	* hsa-regalloc.c: New file.

gcc/fortran/
	* types.def (BT_FN_VOID_UINT_PTR_INT_PTR): New.
	(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_INT_INT): Removed.
	(BT_FN_VOID_INT_OMPFN_SIZE_PTR_PTR_PTR_UINT_PTR_PTR): New.

gcc/lto/
	* lto-partition.c: Include "hsa.h"
	(add_symbol_to_partition_1): Put hsa implementations into the
	same partition as host implementations.

liboffloadmic/
	* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_async_run): New
	unused parameter.
	(GOMP_OFFLOAD_run): Likewise.

include/
	* gomp-constants.h (GOMP_DEVICE_HSA): New macro.
	(GOMP_VERSION_HSA): Likewise.
	(GOMP_TARGET_ARG_DEVICE_MASK): Likewise.
	(GOMP_TARGET_ARG_DEVICE_ALL): Likewise.
	(GOMP_TARGET_ARG_SUBSEQUENT_PARAM): Likewise.
	(GOMP_TARGET_ARG_ID_MASK): Likewise.
	(GOMP_TARGET_ARG_NUM_TEAMS): Likewise.
	(GOMP_TARGET_ARG_THREAD_LIMIT): Likewise.
	(GOMP_TARGET_ARG_VALUE_SHIFT): Likewise.
	(GOMP_TARGET_ARG_HSA_KERNEL_ATTRIBUTES): Likewise.

From-SVN: r232549
2016-01-19 11:35:10 +01:00
Alexander Monakov 0d58938ed7 nvptx plugin: do not force JIT target SM version
When link_ptx runs, a CUDA device is already bound to current thread, so the
driver library knows the target architecture.  There isn't any benefit from
forcing a specific target here; on the contrary, hardcoding sm_30 breaks
offloading on later (Maxwell, sm_5x) devices.
    
	* plugin/plugin-nvptx.c (link_ptx): Do not set CU_JIT_TARGET.

From-SVN: r232227
2016-01-11 15:55:31 +03:00
Jakub Jelinek 818ab71a41 Update copyright years.
From-SVN: r232055
2016-01-04 15:30:50 +01:00
Nathan Sidwell 5c06742f6f libgomp.h (struct acc_dispatch_t): Remove args from exec_func.
* libgomp.h (struct acc_dispatch_t): Remove args from exec_func.
	* plugin/plugin-nvptx.c (nvptx_exec): Remove sizes & kinds arg.
	(GOMP_OFFLOAD_openacc_parallel): Likewise.
	* oacc-host.c (host_openacc_exec): Likewise.
	* oacc-parallel.c (GOACC_parallel_keyed): Adjust exec_func call.

From-SVN: r229721
2015-11-03 20:18:33 +00:00
Nathan Sidwell a1c1908bbd plugin-nvptx.c (nvptx_exec): Remove check on compute dimensions.
* plugin/plugin-nvptx.c (nvptx_exec): Remove check on compute
	dimensions.

From-SVN: r229471
2015-10-28 03:00:10 +00:00
Thomas Schwinge 113020dc59 nvptx offloading linking
gcc/
	* config/nvptx/mkoffload.c (Kind, Vis): Remove enums.
	(Token, Stmt): Remove structs.
	(decls, vars, fns): Remove variables.
	(alloc_comment, append_stmt, is_keyword): Remove macros.
	(tokenize, write_token, write_tokens, alloc_stmt, rev_stmts)
	(write_stmt, write_stmts, parse_insn, parse_list_nosemi)
	(parse_init, parse_file): Remove functions.
	(read_file): Accept a pointer to a length and store into it.
	(process): Don't try to parse the input file, just write it out as
	a string, but looking for maps.  Also write out the length.
	(main): Don't use "-S" to compile PTX code.

	libgomp/
	* oacc-ptx.h: Remove file, moving its content into...
	* config/nvptx/fortran.c: ... here...
	* config/nvptx/oacc-init.c: ..., here...
	* config/nvptx/oacc-parallel.c: ..., and here.
	* config/nvptx/openacc.f90: New file.
	* plugin/plugin-nvptx.c: Don't include "oacc-ptx.h".
	(link_ptx): Don't link in predefined bits of PTX code.

Co-Authored-By: Bernd Schmidt <bernds@codesourcery.com>

From-SVN: r228418
2015-10-02 21:43:41 +02:00
Nathan Sidwell cc3cd79bc2 mkoffload.c (process): Change offload data format.
gcc/
	* config/nvptx/mkoffload.c (process): Change offload data format.

	libgomp/
	* plugin/plugin-nvptx.c (targ_fn_launch): Use GOMP_DIM_MAX.
	(struct targ_ptx_obj): New.
	(nvptx_tdata): Move earlier, change data format.
	(link_ptx): Take targ_ptx_obj ptr and count.  Allow multiple
	objects.
	(GOMP_OFFLOAD_load_image): Adjust.

Co-Authored-By: Bernd Schmidt <bernds@codesourcery.com>

From-SVN: r228308
2015-09-30 21:35:47 +00:00
Nathan Sidwell a12a043782 plugin-nvptx.c (ARRAYSIZE): Delete.
* plugin/plugin-nvptx.c (ARRAYSIZE): Delete.
	(cuda_errlist): Delete.
	(cuda_error): Reimplement.

From-SVN: r228265
2015-09-29 19:45:47 +00:00
Nathan Sidwell 3e32ee19a5 gomp-constants.h (GOMP_VERSION_NVIDIA_PTX): Increment.
inlude/
	* gomp-constants.h (GOMP_VERSION_NVIDIA_PTX): Increment.
	(GOMP_DIM_GANG, GOMP_DIM_WORKER, GOMP_DIM_VECTOR, GOMP_DIM_MAX,
	GOMP_DIM_MASK): New.
	(GOMP_LAUNCH_DIM, GOMP_LAUNCH_ASYNC, GOMP_LAUNCH_WAIT): New.
	(GOMP_LAUNCH_CODE_SHIFT, GOMP_LAUNCH_DEVICE_SHIFT,
	GOMP_LAUNCH_OP_SHIFT): New.
	(GOMP_LAUNCH_PACK, GOMP_LAUNCH_CODE, GOMP_LAUNCH_DEVICE,
	GOMP_LAUNCH_OP): New.
	(GOMP_LAUNCH_OP_MAX): New.

	libgomp/
	* libgomp.h (acc_dispatch_t): Replace separate geometry args with
	array.
	* libgomp.map (GOACC_parallel_keyed): New.
	* oacc-parallel.c (goacc_wait): Take pointer to va_list.  Adjust
	all callers.
	(GOACC_parallel_keyed): New interface.  Lose geometry arguments
	and take keyed varargs list.  Adjust call to exec_func.
	(GOACC_parallel): Force host fallback.
	* libgomp_g.h (GOACC_parallel): Remove.
	(GOACC_parallel_keyed): Declare.
	* plugin/plugin-nvptx.c (struct targ_fn_launch): New struct.
	(stuct targ_gn_descriptor): Replace name field with launch field.
	(nvptx_exec): Lose separate geometry args, take array.  Process
	dynamic dimensions and adjust.
	(struct nvptx_tdata): Replace fn_names field with fn_descs.
	(GOMP_OFFLOAD_load_image): Adjust for change in function table
	data.
	(GOMP_OFFLOAD_openacc_parallel): Adjust for change in dimension
	passing.
	* oacc-host.c (host_openacc_exec): Adjust for change in dimension
	passing.

	gcc/
	* config/nvptx/nvptx.c: Include omp-low.h and gomp-constants.h.
	(nvptx_record_offload_symbol): Record function execution geometry.
	* config/nvptx/mkoffload.c (process): Include launch geometry in
	function data.
	* omp-low.c (oacc_launch_pack): New.
	(replace_oacc_fn_attrib): New.
	(set_oacc_fn_attrib): New.
	(get_oacc_fn_attrib): New.
	(expand_omp_target): Create keyed varargs for GOACC_parallel call
	generation.
	* omp-low.h (get_oacc_fn_attrib): Declare.
	* builtin-types.def (DEF_FUNCTION_TyPE_VAR_6): New.
	(DEF_FUNCTION_TYPE_VAR_11): Delete.
	* tree.h (OMP_CLAUSE_EXPR): New.
	* omp-builtins.def (BUILT_IN_GOACC_PARALLEL): Change target fn name.

	gcc/lto/
	* lto-lang.c (DEF_FUNCTION_TYPE_VAR_6): New.
	(DEF_FUNCTION_TYPE_VAR_11): Delete.

	gcc/c-family/
	* c-common.c (DEF_FUNCTION_TYPE_VAR_6): New.
	(DEF_FUNCTION_TYPE_VAR_11): Delete.

	gcc/fortran/
	* f95-lang.c (DEF_FUNCTION_TYPE_VAR_6): New.
	(DEF_FUNCTION_TYPE_VAR_11): Delete.
	* types.def (DEF_FUNCTION_TYPE_VAR_6): New.
	(DEF_FUNCTION_TYPE_VAR_11): Delete.

	gcc/ada/
	* gcc-interface/utils.c (DEF_FUNCTION_TYPE_VAR_6): Define

From-SVN: r228220
2015-09-28 19:37:33 +00:00
Thomas Schwinge 64186aad5a Fix --enable-offload-targets/-foffload handling, pt. 1
gcc/
	* configure.ac (offload_targets, OFFLOAD_TARGETS): Separate
	offload targets by commas, not colons.
	* config.in: Regenerate.
	* configure: Likewise.
	* gcc.c (driver::maybe_putenv_COLLECT_LTO_WRAPPER): Due to that,
	instead of setting up the default offload targets here...
	(process_command): ..., do it here.
	libgomp/
	* plugin/configfrag.ac (OFFLOAD_TARGETS): Clarify that offload
	targets are separated by commas.
	* config.h.in: Regenerate.

From-SVN: r228053
2015-09-23 16:52:50 +02:00
Nathan Sidwell 2a21ff193a libgomp.map: Add 4.0.2 version.
libgomp/
	* libgomp.map: Add 4.0.2 version.
	* target.c (offload_image_descr): Add version field.
	(gomp_load_image_to_device): Add version argument.  Adjust plugin
	call.  Improve load mismatch diagnostic.
	(gomp_unload_image_from_device): Add version argument.  Adjust plugin
	call.
	(GOMP_offload_regster): Make stub function, move bulk to ...
	(GOMP_offload_register_ver): ... here.  Process version argument.
	(GOMP_offload_unregister): Make stub function, move bulk to ...
	(GOMP_offload_unregister_ver): ... here.  Process version argument.
	(gomp_init_device): Process version field.
	(gomp_unload_device): Process version field.
	(gomp_load_plugin_for_device): Reimplement DLSYM & DLSYM_OPT
	macros.  Check plugin version.
	* libgomp.h (gomp_device_descr): Add version function field.  Adjust
	loader and unloader types.
	* oacc-host.c: Include gomp-constants.h.
	(host_version): New.
	(host_load_image, host_unload_image): Adjust.
	(host_dispatch): Add host_version.
	* plugin/plugin-nvptx.c: Include gomp-constants.h.
	(GOMP_OFFLOAD_version): New.
	(GOMP_OFFLOAD_load_image): Add version arg and check it.
	(GOMP_OFFLOAD_unload_image): Likewise.
	* plugin/plugin-host.c: Include gomp-constants.h.
	(GOMP_OFFLOAD_version): New.
	(GOMP_OFFLOAD_load_image): Add version arg.
	(GOMP_OFFLOAD_unload_image): Likewise.

	liboffloadmic/
	* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_version): New.
	(GOMP_OFFLOAD_load_image): Add version arg and check it.
	(GOMP_OFFLOAD_unload_image): Likewise.

	include/
	* gomp-constants.h (GOMP_VERSION, GOMP_VERSION_NVIDIA_PTX,
	GOMP_VERSION_INTEL_MIC): New.
	(GOMP_VERSION_PACK, GOMP_VERSION_LIB, GOMP_VERSION_DEV): New.

	gcc/
	* config/nvptx/mkoffload.c (process): Replace
	GOMP_offload_{,un}register with GOMP_offload_{,un}register_ver.

From-SVN: r227137
2015-08-24 17:10:06 +00:00
Thomas Schwinge b97e78b712 [PR libgomp/65742, PR middle-end/66332] libgomp: Remove plugin for non-shared memory host execution
gcc/
	* builtins.c (expand_builtin_acc_on_device) [ACCEL_COMPILER]: Emit
	open-coded sequence.
	* omp-low.c (oacc_process_reduction_data): Remove handline of
	GOMP_DEVICE_HOST_NONSHM.
	gcc/testsuite/
	* c-c++-common/goacc/acc_on_device-2.c: Remove XFAIL for C.
	include/
	* gomp-constants.c (GOMP_DEVICE_HOST_NONSHM): Remove.
	libgomp/
	* libgomp-plugin.h (enum offload_target_type): Remove
	OFFLOAD_TARGET_TYPE_HOST_NONSHM.
	* openacc.f90 (openacc_kinds): Remove acc_device_host_nonshm.
	* openacc.h (enum acc_device_t): Likewise.
	* openacc_lib.h: Likewise.
	* oacc-init.c (name_of_acc_device_t): Don't handle it.
	(acc_on_device): Just use __builtin_acc_on_device.
	* testsuite/libgomp.oacc-c-c++-common/if-1.c: Don't forbid usage
	of acc_on_device builtin.
	* plugin/plugin-host.h: Remove file.
	* plugin/plugin-host.c: Likewise, but salvage some content into...
	* oacc-host.c: ... this file.
	* plugin/Makefrag.am: Don't build libgomp-plugin-host_nonshm.la.
	* plugin/configfrag.ac (offload_targets): Don't add host_nonshm.
	* Makefile.in: Regenerate.
	* configure: Likewise.
	* testsuite/lib/libgomp.exp
	(check_effective_target_openacc_host_nonshm_selected): Remove.
	* testsuite/libgomp.oacc-c++/c++.exp: Don't handle
	ACC_DEVICE_TYPE=host_nonshm.
	* testsuite/libgomp.oacc-c/c.exp: Likewise.
	* testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c: Likewise.
	* testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f: Likewise.
	* testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f: Likewise.

From-SVN: r226763
2015-08-10 18:48:26 +02:00
Nathan Sidwell 9ebddeb045 plugin-nvptx.c: Don't include dlfcn.h.
* plugin/plugin-nvptx.c: Don't include dlfcn.h.
	(cuda_errlist): Constify.
	(errmsg):  Move into ...
	(cuda_error): ... here.  Make smaller.
	(_XSTR, _STR): Delete.
	(cuda_synames): Delete.
	(verify_device_library): Delete.
	(nvptx_init): Don't call it.

From-SVN: r226539
2015-08-04 00:40:18 +00:00
Nathan Sidwell f3e9a059a7 plugin-nvptx.c (struct targ_fn_descriptor): Move later.
* plugin/plugin-nvptx.c (struct targ_fn_descriptor): Move later.
	(struct ptx_image_data): Move earlier, add fns field.
	(struct ptx_device): Add images and image_lock fields.
	(ptx_images, ptx_image_lock): Delete.
	(nvptx_open_device): Initialize images and image_lock fields.
	(nvptx_close_device): Destroy image_lock.
	(GOMP_OFFLOAD_load_image): Register image to device-specific fields.
	(GOMP_OFFLOAD_unload_image): Unregister image from device-specific
	fields.

From-SVN: r226004
2015-07-20 16:17:57 +00:00
Nathan Sidwell afb2d80bc5 mkoffload.c (process): Constify target data.
gcc/
	* config/nvptx/mkoffload.c (process): Constify target data.
	* config/i386/intelmic-mkoffload.c (generate_target_descr_file):
	Constify target data.
	(generate_target_offloadend_file): Likewise.

	libgomp/
	* libgomp.h (gomp_device_descr): Constify target data arguments.
	* target.c (struct offload_image_descr): Constify target_data.
	(gomp_offload_image_to_device): Likewise.
	(GOMP_offload_register): Likewise.
	(GOMP_offload_unrefister): Likewise.
	* plugin/plugin-host.c (GOMP_OFFLOAD_load_image,
	GOMP_OFFLOAD_unload_image): Constify target data.
	* plugin/plugin-nvptx.c (struct ptx_image_data): Constify target data.
	(GOMP_OFFLOAD_load_image, GOMP_OFFLOAD_unload_image): Likewise.

	liboffloadmic/
	* plugin/libgomp-plugin-intelmic.cpp (ImgDevAddrMap): Constify.
	(offload_image, GOMP_OFFLOAD_load_image,
	OMP_OFFLOAD_unload_image): Constify target data.

From-SVN: r225936
2015-07-17 14:07:53 +00:00
Nathan Sidwell a4cb876dc9 plugin-nvptx.c (link_ptx): Constify string argument.
libgomp/
	* plugin/plugin-nvptx.c (link_ptx): Constify string argument.
	Workaround driver library const error.
	(struct nvptx_tdata, nvptx_tdata_t): New.
	(GOMP_OFFLOAD_load_image): Use struct for target_data's real
	type.

	gcc/
	* config/nvptx/mkoffload.c (process): Constify mapping variables.
	Define target data struct and initialize it.

From-SVN: r225897
2015-07-16 17:17:31 +00:00
Thomas Schwinge a92defdab7 [nvptx offloading] Only 64-bit configurations are currently supported
PR libgomp/65099
	gcc/
	* config/nvptx/mkoffload.c (main): Create an offload image only in
	64-bit configurations.
	libgomp/
	* plugin/plugin-nvptx.c (nvptx_get_num_devices): Return 0 if not
	in a 64-bit configuration.
	* testsuite/libgomp.oacc-c++/c++.exp: Don't attempt nvidia
	offloading testing if no such device is available.
	* testsuite/libgomp.oacc-c/c.exp: Likewise.
	* testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.

From-SVN: r225560
2015-07-08 16:59:59 +02:00
Julian Brown 32eaed9380 re PR libgomp/65742 (Several libgomp.oacc-* failures after r221922.)
PR libgomp/65742

    gcc/
    * builtins.c (expand_builtin_acc_on_device): Don't use open-coded
    sequence for !ACCEL_COMPILER.

    libgomp/
    * oacc-init.c (plugin/plugin-host.h): Include.
    (acc_on_device): Check whether we're in an offloaded region for
    host_nonshm
    plugin. Don't use __builtin_acc_on_device.
    * plugin/plugin-host.c (GOMP_OFFLOAD_openacc_parallel): Set
    nonshm_exec flag in thread-local data.
    (GOMP_OFFLOAD_openacc_create_thread_data): Allocate thread-local
    data for host_nonshm plugin.
    (GOMP_OFFLOAD_openacc_destroy_thread_data): Free thread-local data
    for host_nonshm plugin.
    * plugin/plugin-host.h: New.

From-SVN: r223801
2015-05-28 09:29:19 +00:00
Julian Brown c831982647 plugin-nvptx.c (nvptx_get_num_devices): Return zero on cuInit failure.
* plugin/plugin-nvptx.c (nvptx_get_num_devices): Return zero
	on cuInit failure.

From-SVN: r223352
2015-05-19 11:06:31 +00:00
Julian Brown d93bdab53b mkoffload.c (process): Support variable mapping.
gcc/
	* config/nvptx/mkoffload.c (process): Support variable mapping.

	libgomp/
	* libgomp.h (target_mem_desc: Remove mem_map field.
	(acc_dispatch_t): Remove open_device_func, close_device_func,
	get_device_num_func, set_device_num_func, target_data members.
	Change create_thread_data_func argument to device number instead of
	generic pointer.
	* oacc-async.c (assert.h): Include.
	(acc_async_test, acc_async_test_all, acc_wait, acc_wait_async)
	(acc_wait_all, acc_wait_all_async): Use current host thread's
	active device, not base_dev.
	* oacc-cuda.c (acc_get_current_cuda_device)
	(acc_get_current_cuda_context, acc_get_cuda_stream)
	(acc_set_cuda_stream): Likewise.
	* oacc-host.c (host_dispatch): Don't set open_device_func,
	close_device_func, get_device_num_func or set_device_num_func.
	* oacc-init.c (base_dev, init_key): Remove.
	(cached_base_dev): New.
	(name_of_acc_device_t): New.
	(acc_init_1): Initialise default-numbered device, not zeroth.
	(acc_shutdown_1): Close all devices of a given type.
	(goacc_destroy_thread): Don't use base_dev.
	(lazy_open, lazy_init, lazy_init_and_open): Remove.
	(goacc_attach_host_thread_to_device): New.
	(acc_init): Reimplement with goacc_attach_host_thread_to_device.
	(acc_get_num_devices): Don't use base_dev.
	(acc_set_device_type): Reimplement.
	(acc_get_device_type): Don't use base_dev.
	(acc_get_device_num): Tweak logic.
	(acc_set_device_num): Likewise.
	(acc_on_device): Use acc_get_device_type.
	(goacc_runtime_initialize): Initialize cached_base_dev not base_dev.
	(goacc_lazy_initialize): Reimplement with acc_init and
	goacc_attach_host_thread_to_device.
	* oacc-int.h (goacc_thread): Add base_dev field.
	(base_dev): Remove extern declaration.
	(goacc_attach_host_thread_to_device): Add prototype.
	* oacc-mem.c (acc_malloc): Use current thread's device instead of
	base_dev.
	(acc_free): Likewise.
	(acc_memcpy_to_device): Likewise.
	(acc_memcpy_from_device): Likewise.
	* oacc-parallel.c (select_acc_device): Remove. Replace calls with
	goacc_lazy_initialize (throughout).
	(GOACC_parallel): Use tgt_offset to locate target functions.
	* target.c (gomp_map_vars): Don't set tgt->mem_map.
	(gomp_unmap_vars): Use devicep->mem_map pointer not tgt->mem_map.
	(gomp_load_plugin_for_device): Remove open_device, close_device,
	get_device_num, set_device_num openacc hook initialisation. Don't set
	openacc.target_data.
	* plugin/plugin-host.c (GOMP_OFFLOAD_openacc_open_device)
	(GOMP_OFFLOAD_openacc_close_device)
	(GOMP_OFFLOAD_openacc_get_device_num)
	(GOMP_OFFLOAD_openacc_set_device_num): Remove.
	(GOMP_OFFLOAD_openacc_create_thread_data): Change (unused) argument
	to int.
	* plugin/plugin-nvptx.c (ptx_inited): Remove.
	(instantiated_devices, ptx_dev_lock): New.
	(struct ptx_image_data): New.
	(ptx_devices, ptx_images, ptx_image_lock): New.
	(fini_streams_for_device): Reorder cuStreamDestroy call.
	(nvptx_get_num_devices): Remove forward declaration.
	(nvptx_init): Change return type to bool.
	(nvptx_fini): Remove.
	(nvptx_attach_host_thread_to_device): New.
	(nvptx_open_device): Return struct ptx_device* instead of void*.
	(nvptx_close_device): Change argument type to struct ptx_device*,
	return type to void.
	(nvptx_get_num_devices): Use instantiated_devices not ptx_inited.
	(kernel_target_data, kernel_host_table): Remove static globals.
	(GOMP_OFFLOAD_register_image, GOMP_OFFLOAD_get_table): Remove.
	(GOMP_OFFLOAD_init_device): Reimplement.
	(GOMP_OFFLOAD_fini_device): Likewise.
	(GOMP_OFFLOAD_load_image, GOMP_OFFLOAD_unload_image): New.
	(GOMP_OFFLOAD_alloc, GOMP_OFFLOAD_free, GOMP_OFFLOAD_dev2host)
	(GOMP_OFFLOAD_host2dev): Use ORD argument.
	(GOMP_OFFLOAD_openacc_open_device)
	(GOMP_OFFLOAD_openacc_close_device)
	(GOMP_OFFLOAD_openacc_set_device_num)
	(GOMP_OFFLOAD_openacc_get_device_num): Remove.
	(GOMP_OFFLOAD_openacc_create_thread_data): Change argument to int
	(device number).

	libgomp/testsuite/
	* libgomp.oacc-c-c++-common/lib-9.c: Fix devnum check in test.

From-SVN: r221922
2015-04-08 15:58:33 +00:00
Ilya Verbin a51df54e48 libgomp: rework initialization of offloading
gcc/
	* config/i386/intelmic-mkoffload.c (generate_host_descr_file): Call
	GOMP_offload_unregister from the destructor.
libgomp/
	* libgomp-plugin.h (struct mapping_table): Replace with addr_pair.
	* libgomp.h (struct gomp_memory_mapping): Remove.
	(struct target_mem_desc): Change type of mem_map from
	gomp_memory_mapping * to splay_tree_s *.
	(struct gomp_device_descr): Remove register_image_func, get_table_func.
	Add load_image_func, unload_image_func.
	Change type of mem_map from gomp_memory_mapping to splay_tree_s.
	Remove offload_regions_registered.
	(gomp_init_tables): Remove.
	(gomp_free_memmap): Change type of argument from gomp_memory_mapping *
	to splay_tree_s *.
	* libgomp.map (GOMP_4.0.1): Add GOMP_offload_unregister.
	* oacc-host.c (host_dispatch): Do not initialize register_image_func,
	get_table_func, mem_map.is_initialized, mem_map.splay_tree.root,
	offload_regions_registered.
	Initialize load_image_func, unload_image_func, mem_map.root.
	(goacc_host_init): Do not initialize host_dispatch.mem_map.lock.
	* oacc-init.c (lazy_open): Don't call gomp_init_tables.
	(acc_shutdown_1): Use dev's lock and splay_tree instead of mem_map's.
	* oacc-mem.c (lookup_host): Get gomp_device_descr *dev instead of
	gomp_memory_mapping *.  Use dev's lock and splay_tree.
	(lookup_dev): Use dev's lock.
	(acc_deviceptr): Pass dev to lookup_host instead of mem_map.
	(acc_is_present): Likewise.
	(acc_map_data): Likewise.
	(acc_unmap_data): Likewise.  Use dev's lock.
	(present_create_copy): Likewise.
	(delete_copyout): Pass dev to lookup_host instead of mem_map.
	(update_dev_host): Likewise.
	(gomp_acc_remove_pointer): Likewise.  Use dev's lock.
	* oacc-parallel.c (GOACC_parallel): Use dev's lock and splay_tree.
	* plugin/plugin-host.c (GOMP_OFFLOAD_register_image): Remove.
	(GOMP_OFFLOAD_get_table): Remove
	(GOMP_OFFLOAD_load_image): New function.
	(GOMP_OFFLOAD_unload_image): New function.
	* target.c (register_lock): New mutex for offload image registration.
	(num_devices): Do not guard with PLUGIN_SUPPORT.
	(gomp_realloc_unlock): New static function.
	(gomp_map_vars_existing): Add device descriptor argument.  Unlock mutex
	before gomp_fatal.
	(gomp_map_vars): Use dev's lock and splay_tree instead of mem_map's.
	Pass devicep to gomp_map_vars_existing.  Unlock mutex before gomp_fatal.
	(gomp_copy_from_async): Use dev's lock and splay_tree instead of
	mem_map's.
	(gomp_unmap_vars): Likewise.
	(gomp_update): Remove gomp_memory_mapping argument.  Use dev's lock and
	splay_tree instead of mm's.  Unlock mutex before gomp_fatal.
	(gomp_offload_image_to_device): New static function.
	(GOMP_offload_register): Add mutex lock.
	Call gomp_offload_image_to_device for all initialized devices.
	Replace gomp_realloc with gomp_realloc_unlock.
	(GOMP_offload_unregister): New function.
	(gomp_init_tables): Replace with gomp_init_device.  Replace a call to
	get_table_func from the plugin with calls to init_device_func and
	gomp_offload_image_to_device.
	(gomp_free_memmap): Change type of argument from gomp_memory_mapping *
	to splay_tree_s *.
	(GOMP_target): Do not call gomp_init_tables.  Use dev's lock and
	splay_tree instead of mem_map's.  Unlock mutex before gomp_fatal.
	(GOMP_target_data): Do not call gomp_init_tables.
	(GOMP_target_update): Likewise.  Remove argument from gomp_update.
	(gomp_load_plugin_for_device): Replace register_image and get_table
	with load_image and unload_image in DLSYM ().
	(gomp_register_images_for_device): Remove function.
	(gomp_target_init): Do not initialize current_device.mem_map.*,
	current_device.offload_regions_registered.
	Remove call to gomp_register_images_for_device.
	Do not free offload_images and num_offload_images.
liboffloadmic/
	* plugin/libgomp-plugin-intelmic.cpp: Include map.
	(AddrVect, DevAddrVect, ImgDevAddrMap): New typedefs.
	(num_devices, num_images, address_table): New static vars.
	(num_libraries, lib_descrs): Remove static vars.
	(set_mic_lib_path): Rename to ...
	(init): ... this.  Allocate address_table and get num_devices.
	(GOMP_OFFLOAD_get_num_devices): return num_devices.
	(load_lib_and_get_table): Remove static function.
	(offload_image): New static function.
	(GOMP_OFFLOAD_get_table): Remove function.
	(GOMP_OFFLOAD_load_image, GOMP_OFFLOAD_unload_image): New functions.

From-SVN: r221878
2015-04-06 12:40:28 +00:00
Thomas Schwinge 41dbbb3789 Merge current set of OpenACC changes from gomp-4_0-branch.
contrib/
	* gcc_update (files_and_dependencies): Update rules for new
	libgomp/plugin/Makefrag.am and libgomp/plugin/configfrag.ac files.
	gcc/
	* builtin-types.def (BT_FN_VOID_INT_INT_VAR)
	(BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR_INT_INT_VAR)
	(BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR_INT_INT_INT_INT_INT_VAR):
	New function types.
	* builtins.c: Include "gomp-constants.h".
	(expand_builtin_acc_on_device): New function.
	(expand_builtin, is_inexpensive_builtin): Handle
	BUILT_IN_ACC_ON_DEVICE.
	* builtins.def (DEF_GOACC_BUILTIN, DEF_GOACC_BUILTIN_COMPILER):
	New macros.
	* cgraph.c (cgraph_node::create): Consider flag_openacc next to
	flag_openmp.
	* config.gcc <nvptx-*> (tm_file): Add nvptx/offload.h.
	<*-intelmic-* | *-intelmicemul-*> (tm_file): Add
	i386/intelmic-offload.h.
	* gcc.c (LINK_COMMAND_SPEC, GOMP_SELF_SPECS): For -fopenacc, link
	to libgomp and its dependencies.
	* config/arc/arc.h (LINK_COMMAND_SPEC): Likewise.
	* config/darwin.h (LINK_COMMAND_SPEC_A): Likewise.
	* config/i386/mingw32.h (GOMP_SELF_SPECS): Likewise.
	* config/ia64/hpux.h (LIB_SPEC): Likewise.
	* config/pa/pa-hpux11.h (LIB_SPEC): Likewise.
	* config/pa/pa64-hpux.h (LIB_SPEC): Likewise.
	* doc/generic.texi: Update for OpenACC changes.
	* doc/gimple.texi: Likewise.
	* doc/invoke.texi: Likewise.
	* doc/sourcebuild.texi: Likewise.
	* gimple-pretty-print.c (dump_gimple_omp_for): Handle
	GF_OMP_FOR_KIND_OACC_LOOP.
	(dump_gimple_omp_target): Handle GF_OMP_TARGET_KIND_OACC_KERNELS,
	GF_OMP_TARGET_KIND_OACC_PARALLEL, GF_OMP_TARGET_KIND_OACC_DATA,
	GF_OMP_TARGET_KIND_OACC_UPDATE,
	GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA.
	Dump more data.
	* gimple.c: Update comments for OpenACC changes.
	* gimple.def: Likewise.
	* gimple.h: Likewise.
	(enum gf_mask): Add GF_OMP_FOR_KIND_OACC_LOOP,
	GF_OMP_TARGET_KIND_OACC_PARALLEL, GF_OMP_TARGET_KIND_OACC_KERNELS,
	GF_OMP_TARGET_KIND_OACC_DATA, GF_OMP_TARGET_KIND_OACC_UPDATE,
	GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA.
	(gimple_omp_for_cond, gimple_omp_for_set_cond): Sort in the
	appropriate place.
	(is_gimple_omp_oacc, is_gimple_omp_offloaded): New functions.
	* gimplify.c: Include "gomp-constants.h".
	Update comments for OpenACC changes.
	(is_gimple_stmt): Handle OACC_PARALLEL, OACC_KERNELS, OACC_DATA,
	OACC_HOST_DATA, OACC_DECLARE, OACC_UPDATE, OACC_ENTER_DATA,
	OACC_EXIT_DATA, OACC_CACHE, OACC_LOOP.
	(gimplify_scan_omp_clauses, gimplify_adjust_omp_clauses): Handle
	OMP_CLAUSE__CACHE_, OMP_CLAUSE_ASYNC, OMP_CLAUSE_WAIT,
	OMP_CLAUSE_NUM_GANGS, OMP_CLAUSE_NUM_WORKERS,
	OMP_CLAUSE_VECTOR_LENGTH, OMP_CLAUSE_GANG, OMP_CLAUSE_WORKER,
	OMP_CLAUSE_VECTOR, OMP_CLAUSE_DEVICE_RESIDENT,
	OMP_CLAUSE_USE_DEVICE, OMP_CLAUSE_INDEPENDENT, OMP_CLAUSE_AUTO,
	OMP_CLAUSE_SEQ.
	(gimplify_adjust_omp_clauses_1, gimplify_adjust_omp_clauses): Use
	GOMP_MAP_* instead of OMP_CLAUSE_MAP_*.  Use
	OMP_CLAUSE_SET_MAP_KIND.
	(gimplify_oacc_cache): New function.
	(gimplify_omp_for): Handle OACC_LOOP.
	(gimplify_omp_workshare): Handle OACC_KERNELS, OACC_PARALLEL,
	OACC_DATA.
	(gimplify_omp_target_update): Handle OACC_ENTER_DATA,
	OACC_EXIT_DATA, OACC_UPDATE.
	(gimplify_expr): Handle OACC_LOOP, OACC_CACHE, OACC_HOST_DATA,
	OACC_DECLARE, OACC_KERNELS, OACC_PARALLEL, OACC_DATA,
	OACC_ENTER_DATA, OACC_EXIT_DATA, OACC_UPDATE.
	(gimplify_body): Consider flag_openacc next to flag_openmp.
	* lto-streamer-out.c: Include "gomp-constants.h".
	* omp-builtins.def (BUILT_IN_ACC_GET_DEVICE_TYPE)
	(BUILT_IN_GOACC_DATA_START, BUILT_IN_GOACC_DATA_END)
	(BUILT_IN_GOACC_ENTER_EXIT_DATA, BUILT_IN_GOACC_PARALLEL)
	(BUILT_IN_GOACC_UPDATE, BUILT_IN_GOACC_WAIT)
	(BUILT_IN_GOACC_GET_THREAD_NUM, BUILT_IN_GOACC_GET_NUM_THREADS)
	(BUILT_IN_ACC_ON_DEVICE): New builtins.
	* omp-low.c: Include "gomp-constants.h".
	Update comments for OpenACC changes.
	(struct omp_context): Add reduction_map, gwv_below, gwv_this
	members.
	(extract_omp_for_data, use_pointer_for_field, install_var_field)
	(new_omp_context, delete_omp_context, scan_sharing_clauses)
	(create_omp_child_function, scan_omp_for, scan_omp_target)
	(check_omp_nesting_restrictions, lower_reduction_clauses)
	(build_omp_regions_1, diagnose_sb_0, make_gimple_omp_edges):
	Update for OpenACC changes.
	(scan_sharing_clauses): Handle OMP_CLAUSE_NUM_GANGS:
	OMP_CLAUSE_NUM_WORKERS: OMP_CLAUSE_VECTOR_LENGTH,
	OMP_CLAUSE_ASYNC, OMP_CLAUSE_WAIT, OMP_CLAUSE_GANG,
	OMP_CLAUSE_WORKER, OMP_CLAUSE_VECTOR, OMP_CLAUSE_DEVICE_RESIDENT,
	OMP_CLAUSE_USE_DEVICE, OMP_CLAUSE__CACHE_, OMP_CLAUSE_INDEPENDENT,
	OMP_CLAUSE_AUTO, OMP_CLAUSE_SEQ.  Use GOMP_MAP_* instead of
	OMP_CLAUSE_MAP_*.
	(expand_omp_for_static_nochunk, expand_omp_for_static_chunk):
	Handle GF_OMP_FOR_KIND_OACC_LOOP.
	(expand_omp_target, lower_omp_target): Handle
	GF_OMP_TARGET_KIND_OACC_PARALLEL, GF_OMP_TARGET_KIND_OACC_KERNELS,
	GF_OMP_TARGET_KIND_OACC_UPDATE,
	GF_OMP_TARGET_KIND_OACC_ENTER_EXIT_DATA,
	GF_OMP_TARGET_KIND_OACC_DATA.
	(pass_expand_omp::execute, execute_lower_omp)
	(pass_diagnose_omp_blocks::gate): Consider flag_openacc next to
	flag_openmp.
	(offload_symbol_decl): New variable.
	(oacc_get_reduction_array_id, oacc_max_threads)
	(get_offload_symbol_decl, get_base_type, lookup_oacc_reduction)
	(maybe_lookup_oacc_reduction, enclosing_target_ctx)
	(oacc_loop_or_target_p, oacc_lower_reduction_var_helper)
	(oacc_gimple_assign, oacc_initialize_reduction_data)
	(oacc_finalize_reduction_data, oacc_process_reduction_data): New
	functions.
	(is_targetreg_ctx): Remove function.
	* tree-core.h (enum omp_clause_code): Add OMP_CLAUSE__CACHE_,
	OMP_CLAUSE_DEVICE_RESIDENT, OMP_CLAUSE_USE_DEVICE,
	OMP_CLAUSE_GANG, OMP_CLAUSE_ASYNC, OMP_CLAUSE_WAIT,
	OMP_CLAUSE_AUTO, OMP_CLAUSE_SEQ, OMP_CLAUSE_INDEPENDENT,
	OMP_CLAUSE_WORKER, OMP_CLAUSE_VECTOR, OMP_CLAUSE_NUM_GANGS,
	OMP_CLAUSE_NUM_WORKERS, OMP_CLAUSE_VECTOR_LENGTH.
	* tree.c (omp_clause_code_name, walk_tree_1): Update accordingly.
	* tree.h (OMP_CLAUSE_GANG_EXPR, OMP_CLAUSE_GANG_STATIC_EXPR)
	(OMP_CLAUSE_ASYNC_EXPR, OMP_CLAUSE_WAIT_EXPR)
	(OMP_CLAUSE_VECTOR_EXPR, OMP_CLAUSE_WORKER_EXPR)
	(OMP_CLAUSE_NUM_GANGS_EXPR, OMP_CLAUSE_NUM_WORKERS_EXPR)
	(OMP_CLAUSE_VECTOR_LENGTH_EXPR): New macros.
	* tree-core.h: Update comments for OpenACC changes.
	(enum omp_clause_map_kind): Remove.
	(struct tree_omp_clause): Change type of map_kind member from enum
	omp_clause_map_kind to unsigned char.
	* tree-inline.c: Update comments for OpenACC changes.
	* tree-nested.c: Likewise.  Include "gomp-constants.h".
	(convert_nonlocal_reference_stmt, convert_local_reference_stmt)
	(convert_tramp_reference_stmt, convert_gimple_call): Update for
	OpenACC changes.  Use GOMP_MAP_* instead of OMP_CLAUSE_MAP_*.  Use
	OMP_CLAUSE_SET_MAP_KIND.
	* tree-pretty-print.c: Include "gomp-constants.h".
	(dump_omp_clause): Handle OMP_CLAUSE_DEVICE_RESIDENT,
	OMP_CLAUSE_USE_DEVICE, OMP_CLAUSE__CACHE_, OMP_CLAUSE_GANG,
	OMP_CLAUSE_ASYNC, OMP_CLAUSE_AUTO, OMP_CLAUSE_SEQ,
	OMP_CLAUSE_WAIT, OMP_CLAUSE_WORKER, OMP_CLAUSE_VECTOR,
	OMP_CLAUSE_NUM_GANGS, OMP_CLAUSE_NUM_WORKERS,
	OMP_CLAUSE_VECTOR_LENGTH, OMP_CLAUSE_INDEPENDENT.  Use GOMP_MAP_*
	instead of OMP_CLAUSE_MAP_*.
	(dump_generic_node): Handle OACC_PARALLEL, OACC_KERNELS,
	OACC_DATA, OACC_HOST_DATA, OACC_DECLARE, OACC_UPDATE,
	OACC_ENTER_DATA, OACC_EXIT_DATA, OACC_CACHE, OACC_LOOP.
	* tree-streamer-in.c: Include "gomp-constants.h".
	(unpack_ts_omp_clause_value_fields) Use GOMP_MAP_* instead of
	OMP_CLAUSE_MAP_*.  Use OMP_CLAUSE_SET_MAP_KIND.
	* tree-streamer-out.c: Include "gomp-constants.h".
	(pack_ts_omp_clause_value_fields): Use GOMP_MAP_* instead of
	OMP_CLAUSE_MAP_*.
	* tree.def (OACC_PARALLEL, OACC_KERNELS, OACC_DATA)
	(OACC_HOST_DATA, OACC_LOOP, OACC_CACHE, OACC_DECLARE)
	(OACC_ENTER_DATA, OACC_EXIT_DATA, OACC_UPDATE): New tree codes.
	* tree.c (omp_clause_num_ops): Update accordingly.
	* tree.h (OMP_BODY, OMP_CLAUSES, OMP_LOOP_CHECK, OMP_CLAUSE_SIZE):
	Likewise.
	(OACC_PARALLEL_BODY, OACC_PARALLEL_CLAUSES, OACC_KERNELS_BODY)
	(OACC_KERNELS_CLAUSES, OACC_DATA_BODY, OACC_DATA_CLAUSES)
	(OACC_HOST_DATA_BODY, OACC_HOST_DATA_CLAUSES, OACC_CACHE_CLAUSES)
	(OACC_DECLARE_CLAUSES, OACC_ENTER_DATA_CLAUSES)
	(OACC_EXIT_DATA_CLAUSES, OACC_UPDATE_CLAUSES)
	(OACC_KERNELS_COMBINED, OACC_PARALLEL_COMBINED): New macros.
	* tree.h (OMP_CLAUSE_MAP_KIND): Cast it to enum gomp_map_kind.
	(OMP_CLAUSE_SET_MAP_KIND): New macro.
	* varpool.c (varpool_node::get_create): Consider flag_openacc next
	to flag_openmp.
	* config/i386/intelmic-offload.h: New file.
	* config/nvptx/offload.h: Likewise.
	gcc/ada/
	* gcc-interface/utils.c (DEF_FUNCTION_TYPE_VAR_8)
	(DEF_FUNCTION_TYPE_VAR_12): New macros.
	gcc/c-family/
	* c.opt (fopenacc): New option.
	* c-cppbuiltin.c (c_cpp_builtins): Conditionally define _OPENACC.
	* c-common.c (DEF_FUNCTION_TYPE_VAR_8, DEF_FUNCTION_TYPE_VAR_12):
	New macros.
	* c-common.h (c_finish_oacc_wait): New prototype.
	* c-omp.c: Include "omp-low.h" and "gomp-constants.h".
	(c_finish_oacc_wait): New function.
	* c-pragma.c (oacc_pragmas): New variable.
	(c_pp_lookup_pragma, init_pragma): Handle it.
	* c-pragma.h (enum pragma_kind): Add PRAGMA_OACC_CACHE,
	PRAGMA_OACC_DATA, PRAGMA_OACC_ENTER_DATA, PRAGMA_OACC_EXIT_DATA,
	PRAGMA_OACC_KERNELS, PRAGMA_OACC_LOOP, PRAGMA_OACC_PARALLEL,
	PRAGMA_OACC_UPDATE, PRAGMA_OACC_WAIT.
	(enum pragma_omp_clause): Add PRAGMA_OACC_CLAUSE_ASYNC,
	PRAGMA_OACC_CLAUSE_AUTO, PRAGMA_OACC_CLAUSE_COLLAPSE,
	PRAGMA_OACC_CLAUSE_COPY, PRAGMA_OACC_CLAUSE_COPYIN,
	PRAGMA_OACC_CLAUSE_COPYOUT, PRAGMA_OACC_CLAUSE_CREATE,
	PRAGMA_OACC_CLAUSE_DELETE, PRAGMA_OACC_CLAUSE_DEVICE,
	PRAGMA_OACC_CLAUSE_DEVICEPTR, PRAGMA_OACC_CLAUSE_FIRSTPRIVATE,
	PRAGMA_OACC_CLAUSE_GANG, PRAGMA_OACC_CLAUSE_HOST,
	PRAGMA_OACC_CLAUSE_IF, PRAGMA_OACC_CLAUSE_NUM_GANGS,
	PRAGMA_OACC_CLAUSE_NUM_WORKERS, PRAGMA_OACC_CLAUSE_PRESENT,
	PRAGMA_OACC_CLAUSE_PRESENT_OR_COPY,
	PRAGMA_OACC_CLAUSE_PRESENT_OR_COPYIN,
	PRAGMA_OACC_CLAUSE_PRESENT_OR_COPYOUT,
	PRAGMA_OACC_CLAUSE_PRESENT_OR_CREATE, PRAGMA_OACC_CLAUSE_PRIVATE,
	PRAGMA_OACC_CLAUSE_REDUCTION, PRAGMA_OACC_CLAUSE_SELF,
	PRAGMA_OACC_CLAUSE_SEQ, PRAGMA_OACC_CLAUSE_VECTOR,
	PRAGMA_OACC_CLAUSE_VECTOR_LENGTH, PRAGMA_OACC_CLAUSE_WAIT,
	PRAGMA_OACC_CLAUSE_WORKER.
	gcc/c/
	* c-parser.c: Include "gomp-constants.h".
	(c_parser_omp_clause_map): Use enum gomp_map_kind instead of enum
	omp_clause_map_kind.  Use GOMP_MAP_* instead of OMP_CLAUSE_MAP_*.
	Use OMP_CLAUSE_SET_MAP_KIND.
	(c_parser_pragma): Handle PRAGMA_OACC_ENTER_DATA,
	PRAGMA_OACC_EXIT_DATA, PRAGMA_OACC_UPDATE.
	(c_parser_omp_construct): Handle PRAGMA_OACC_CACHE,
	PRAGMA_OACC_DATA, PRAGMA_OACC_KERNELS, PRAGMA_OACC_LOOP,
	PRAGMA_OACC_PARALLEL, PRAGMA_OACC_WAIT.
	(c_parser_omp_clause_name): Handle "auto", "async", "copy",
	"copyout", "create", "delete", "deviceptr", "gang", "host",
	"num_gangs", "num_workers", "present", "present_or_copy", "pcopy",
	"present_or_copyin", "pcopyin", "present_or_copyout", "pcopyout",
	"present_or_create", "pcreate", "seq", "self", "vector",
	"vector_length", "wait", "worker".
	(OACC_DATA_CLAUSE_MASK, OACC_KERNELS_CLAUSE_MASK)
	(OACC_ENTER_DATA_CLAUSE_MASK, OACC_EXIT_DATA_CLAUSE_MASK)
	(OACC_LOOP_CLAUSE_MASK, OACC_PARALLEL_CLAUSE_MASK)
	(OACC_UPDATE_CLAUSE_MASK, OACC_WAIT_CLAUSE_MASK): New macros.
	(c_parser_omp_variable_list): Handle OMP_CLAUSE__CACHE_.
	(c_parser_oacc_wait_list, c_parser_oacc_data_clause)
	(c_parser_oacc_data_clause_deviceptr)
	(c_parser_omp_clause_num_gangs, c_parser_omp_clause_num_workers)
	(c_parser_oacc_clause_async, c_parser_oacc_clause_wait)
	(c_parser_omp_clause_vector_length, c_parser_oacc_all_clauses)
	(c_parser_oacc_cache, c_parser_oacc_data, c_parser_oacc_kernels)
	(c_parser_oacc_enter_exit_data, c_parser_oacc_loop)
	(c_parser_oacc_parallel, c_parser_oacc_update)
	(c_parser_oacc_wait): New functions.
	* c-tree.h (c_finish_oacc_parallel, c_finish_oacc_kernels)
	(c_finish_oacc_data): New prototypes.
	* c-typeck.c: Include "gomp-constants.h".
	(handle_omp_array_sections): Handle GOMP_MAP_FORCE_DEVICEPTR.  Use
	GOMP_MAP_* instead of OMP_CLAUSE_MAP_*.  Use
	OMP_CLAUSE_SET_MAP_KIND.
	(c_finish_oacc_parallel, c_finish_oacc_kernels)
	(c_finish_oacc_data): New functions.
	(c_finish_omp_clauses): Handle OMP_CLAUSE__CACHE_,
	OMP_CLAUSE_NUM_GANGS, OMP_CLAUSE_NUM_WORKERS,
	OMP_CLAUSE_VECTOR_LENGTH, OMP_CLAUSE_ASYNC, OMP_CLAUSE_WAIT,
	OMP_CLAUSE_AUTO, OMP_CLAUSE_SEQ, OMP_CLAUSE_GANG,
	OMP_CLAUSE_WORKER, OMP_CLAUSE_VECTOR, and OMP_CLAUSE_MAP's
	GOMP_MAP_FORCE_DEVICEPTR.
	gcc/cp/
	* parser.c: Include "gomp-constants.h".
	(cp_parser_omp_clause_map): Use enum gomp_map_kind instead of enum
	omp_clause_map_kind.  Use GOMP_MAP_* instead of OMP_CLAUSE_MAP_*.
	Use OMP_CLAUSE_SET_MAP_KIND.
	(cp_parser_omp_construct, cp_parser_pragma): Handle
	PRAGMA_OACC_CACHE, PRAGMA_OACC_DATA, PRAGMA_OACC_ENTER_DATA,
	PRAGMA_OACC_EXIT_DATA, PRAGMA_OACC_KERNELS, PRAGMA_OACC_PARALLEL,
	PRAGMA_OACC_LOOP, PRAGMA_OACC_UPDATE, PRAGMA_OACC_WAIT.
	(cp_parser_omp_clause_name): Handle "async", "copy", "copyout",
	"create", "delete", "deviceptr", "host", "num_gangs",
	"num_workers", "present", "present_or_copy", "pcopy",
	"present_or_copyin", "pcopyin", "present_or_copyout", "pcopyout",
	"present_or_create", "pcreate", "vector_length", "wait".
	(OACC_DATA_CLAUSE_MASK, OACC_ENTER_DATA_CLAUSE_MASK)
	(OACC_EXIT_DATA_CLAUSE_MASK, OACC_KERNELS_CLAUSE_MASK)
	(OACC_LOOP_CLAUSE_MASK, OACC_PARALLEL_CLAUSE_MASK)
	(OACC_UPDATE_CLAUSE_MASK, OACC_WAIT_CLAUSE_MASK): New macros.
	(cp_parser_omp_var_list_no_open): Handle OMP_CLAUSE__CACHE_.
	(cp_parser_oacc_data_clause, cp_parser_oacc_data_clause_deviceptr)
	(cp_parser_oacc_clause_vector_length, cp_parser_oacc_wait_list)
	(cp_parser_oacc_clause_wait, cp_parser_omp_clause_num_gangs)
	(cp_parser_omp_clause_num_workers, cp_parser_oacc_clause_async)
	(cp_parser_oacc_all_clauses, cp_parser_oacc_cache)
	(cp_parser_oacc_data, cp_parser_oacc_enter_exit_data)
	(cp_parser_oacc_kernels, cp_parser_oacc_loop)
	(cp_parser_oacc_parallel, cp_parser_oacc_update)
	(cp_parser_oacc_wait): New functions.
	* cp-tree.h (finish_oacc_data, finish_oacc_kernels)
	(finish_oacc_parallel): New prototypes.
	* semantics.c: Include "gomp-constants.h".
	(handle_omp_array_sections): Handle GOMP_MAP_FORCE_DEVICEPTR.  Use
	GOMP_MAP_* instead of OMP_CLAUSE_MAP_*.  Use
	OMP_CLAUSE_SET_MAP_KIND.
	(finish_omp_clauses): Handle OMP_CLAUSE_ASYNC,
	OMP_CLAUSE_VECTOR_LENGTH, OMP_CLAUSE_WAIT, OMP_CLAUSE__CACHE_.
	Use GOMP_MAP_* instead of OMP_CLAUSE_MAP_*.
	(finish_oacc_data, finish_oacc_kernels, finish_oacc_parallel): New
	functions.
	gcc/fortran/
	* lang.opt (fopenacc): New option.
	* cpp.c (cpp_define_builtins): Conditionally define _OPENACC.
	* dump-parse-tree.c (show_omp_node): Split part of it into...
	(show_omp_clauses): ... this new function.
	(show_omp_node, show_code_node): Handle EXEC_OACC_PARALLEL_LOOP,
	EXEC_OACC_PARALLEL, EXEC_OACC_KERNELS_LOOP, EXEC_OACC_KERNELS,
	EXEC_OACC_DATA, EXEC_OACC_HOST_DATA, EXEC_OACC_LOOP,
	EXEC_OACC_UPDATE, EXEC_OACC_WAIT, EXEC_OACC_CACHE,
	EXEC_OACC_ENTER_DATA, EXEC_OACC_EXIT_DATA.
	(show_namespace): Update for OpenACC.
	* f95-lang.c (DEF_FUNCTION_TYPE_VAR_2, DEF_FUNCTION_TYPE_VAR_8)
	(DEF_FUNCTION_TYPE_VAR_12, DEF_GOACC_BUILTIN)
	(DEF_GOACC_BUILTIN_COMPILER): New macros.
	* types.def (BT_FN_VOID_INT_INT_VAR)
	(BT_FN_VOID_INT_PTR_SIZE_PTR_PTR_PTR_INT_INT_VAR)
	(BT_FN_VOID_INT_OMPFN_PTR_SIZE_PTR_PTR_PTR_INT_INT_INT_INT_INT_VAR):
	New function types.
	* gfortran.h (gfc_statement): Add ST_OACC_PARALLEL_LOOP,
	ST_OACC_END_PARALLEL_LOOP, ST_OACC_PARALLEL, ST_OACC_END_PARALLEL,
	ST_OACC_KERNELS, ST_OACC_END_KERNELS, ST_OACC_DATA,
	ST_OACC_END_DATA, ST_OACC_HOST_DATA, ST_OACC_END_HOST_DATA,
	ST_OACC_LOOP, ST_OACC_END_LOOP, ST_OACC_DECLARE, ST_OACC_UPDATE,
	ST_OACC_WAIT, ST_OACC_CACHE, ST_OACC_KERNELS_LOOP,
	ST_OACC_END_KERNELS_LOOP, ST_OACC_ENTER_DATA, ST_OACC_EXIT_DATA,
	ST_OACC_ROUTINE.
	(struct gfc_expr_list): New data type.
	(gfc_get_expr_list): New macro.
	(gfc_omp_map_op): Add OMP_MAP_FORCE_ALLOC, OMP_MAP_FORCE_DEALLOC,
	OMP_MAP_FORCE_TO, OMP_MAP_FORCE_FROM, OMP_MAP_FORCE_TOFROM,
	OMP_MAP_FORCE_PRESENT, OMP_MAP_FORCE_DEVICEPTR.
	(OMP_LIST_FIRST, OMP_LIST_DEVICE_RESIDENT, OMP_LIST_USE_DEVICE)
	(OMP_LIST_CACHE): New enumerators.
	(struct gfc_omp_clauses): Add async_expr, gang_expr, worker_expr,
	vector_expr, num_gangs_expr, num_workers_expr, vector_length_expr,
	wait_list, tile_list, async, gang, worker, vector, seq,
	independent, wait, par_auto, gang_static, and loc members.
	(struct gfc_namespace): Add oacc_declare_clauses member.
	(gfc_exec_op): Add EXEC_OACC_KERNELS_LOOP,
	EXEC_OACC_PARALLEL_LOOP, EXEC_OACC_PARALLEL, EXEC_OACC_KERNELS,
	EXEC_OACC_DATA, EXEC_OACC_HOST_DATA, EXEC_OACC_LOOP,
	EXEC_OACC_UPDATE, EXEC_OACC_WAIT, EXEC_OACC_CACHE,
	EXEC_OACC_ENTER_DATA, EXEC_OACC_EXIT_DATA.
	(gfc_free_expr_list, gfc_resolve_oacc_directive)
	(gfc_resolve_oacc_declare, gfc_resolve_oacc_parallel_loop_blocks)
	(gfc_resolve_oacc_blocks): New prototypes.
	* match.c (match_exit_cycle): Handle EXEC_OACC_LOOP and
	EXEC_OACC_PARALLEL_LOOP.
	* match.h (gfc_match_oacc_cache, gfc_match_oacc_wait)
	(gfc_match_oacc_update, gfc_match_oacc_declare)
	(gfc_match_oacc_loop, gfc_match_oacc_host_data)
	(gfc_match_oacc_data, gfc_match_oacc_kernels)
	(gfc_match_oacc_kernels_loop, gfc_match_oacc_parallel)
	(gfc_match_oacc_parallel_loop, gfc_match_oacc_enter_data)
	(gfc_match_oacc_exit_data, gfc_match_oacc_routine): New
	prototypes.
	* openmp.c: Include "diagnostic.h" and "gomp-constants.h".
	(gfc_free_omp_clauses): Update for members added to struct
	gfc_omp_clauses.
	(gfc_match_omp_clauses): Change mask paramter to uint64_t.  Add
	openacc parameter.
	(resolve_omp_clauses): Add openacc parameter.  Update for OpenACC.
	(struct fortran_omp_context): Add is_openmp member.
	(gfc_resolve_omp_parallel_blocks): Initialize it.
	(gfc_resolve_do_iterator): Update for OpenACC.
	(gfc_resolve_omp_directive): Call
	resolve_omp_directive_inside_oacc_region.
	(OMP_CLAUSE_PRIVATE, OMP_CLAUSE_FIRSTPRIVATE)
	(OMP_CLAUSE_LASTPRIVATE, OMP_CLAUSE_COPYPRIVATE)
	(OMP_CLAUSE_SHARED, OMP_CLAUSE_COPYIN, OMP_CLAUSE_REDUCTION)
	(OMP_CLAUSE_IF, OMP_CLAUSE_NUM_THREADS, OMP_CLAUSE_SCHEDULE)
	(OMP_CLAUSE_DEFAULT, OMP_CLAUSE_ORDERED, OMP_CLAUSE_COLLAPSE)
	(OMP_CLAUSE_UNTIED, OMP_CLAUSE_FINAL, OMP_CLAUSE_MERGEABLE)
	(OMP_CLAUSE_ALIGNED, OMP_CLAUSE_DEPEND, OMP_CLAUSE_INBRANCH)
	(OMP_CLAUSE_LINEAR, OMP_CLAUSE_NOTINBRANCH, OMP_CLAUSE_PROC_BIND)
	(OMP_CLAUSE_SAFELEN, OMP_CLAUSE_SIMDLEN, OMP_CLAUSE_UNIFORM)
	(OMP_CLAUSE_DEVICE, OMP_CLAUSE_MAP, OMP_CLAUSE_TO)
	(OMP_CLAUSE_FROM, OMP_CLAUSE_NUM_TEAMS, OMP_CLAUSE_THREAD_LIMIT)
	(OMP_CLAUSE_DIST_SCHEDULE): Use uint64_t.
	(OMP_CLAUSE_ASYNC, OMP_CLAUSE_NUM_GANGS, OMP_CLAUSE_NUM_WORKERS)
	(OMP_CLAUSE_VECTOR_LENGTH, OMP_CLAUSE_COPY, OMP_CLAUSE_COPYOUT)
	(OMP_CLAUSE_CREATE, OMP_CLAUSE_PRESENT)
	(OMP_CLAUSE_PRESENT_OR_COPY, OMP_CLAUSE_PRESENT_OR_COPYIN)
	(OMP_CLAUSE_PRESENT_OR_COPYOUT, OMP_CLAUSE_PRESENT_OR_CREATE)
	(OMP_CLAUSE_DEVICEPTR, OMP_CLAUSE_GANG, OMP_CLAUSE_WORKER)
	(OMP_CLAUSE_VECTOR, OMP_CLAUSE_SEQ, OMP_CLAUSE_INDEPENDENT)
	(OMP_CLAUSE_USE_DEVICE, OMP_CLAUSE_DEVICE_RESIDENT)
	(OMP_CLAUSE_HOST_SELF, OMP_CLAUSE_OACC_DEVICE, OMP_CLAUSE_WAIT)
	(OMP_CLAUSE_DELETE, OMP_CLAUSE_AUTO, OMP_CLAUSE_TILE): New macros.
	(gfc_match_omp_clauses): Handle those.
	(OACC_PARALLEL_CLAUSES, OACC_KERNELS_CLAUSES, OACC_DATA_CLAUSES)
	(OACC_LOOP_CLAUSES, OACC_PARALLEL_LOOP_CLAUSES)
	(OACC_KERNELS_LOOP_CLAUSES, OACC_HOST_DATA_CLAUSES)
	(OACC_DECLARE_CLAUSES, OACC_UPDATE_CLAUSES)
	(OACC_ENTER_DATA_CLAUSES, OACC_EXIT_DATA_CLAUSES)
	(OACC_WAIT_CLAUSES): New macros.
	(gfc_free_expr_list, match_oacc_expr_list, match_oacc_clause_gang)
	(gfc_match_omp_map_clause, gfc_match_oacc_parallel_loop)
	(gfc_match_oacc_parallel, gfc_match_oacc_kernels_loop)
	(gfc_match_oacc_kernels, gfc_match_oacc_data)
	(gfc_match_oacc_host_data, gfc_match_oacc_loop)
	(gfc_match_oacc_declare, gfc_match_oacc_update)
	(gfc_match_oacc_enter_data, gfc_match_oacc_exit_data)
	(gfc_match_oacc_wait, gfc_match_oacc_cache)
	(gfc_match_oacc_routine, oacc_is_loop)
	(resolve_oacc_scalar_int_expr, resolve_oacc_positive_int_expr)
	(check_symbol_not_pointer, check_array_not_assumed)
	(resolve_oacc_data_clauses, resolve_oacc_deviceptr_clause)
	(oacc_compatible_clauses, oacc_is_parallel, oacc_is_kernels)
	(omp_code_to_statement, oacc_code_to_statement)
	(resolve_oacc_directive_inside_omp_region)
	(resolve_omp_directive_inside_oacc_region)
	(resolve_oacc_nested_loops, resolve_oacc_params_in_parallel)
	(resolve_oacc_loop_blocks, gfc_resolve_oacc_blocks)
	(resolve_oacc_loop, resolve_oacc_cache, gfc_resolve_oacc_declare)
	(gfc_resolve_oacc_directive): New functions.
	* parse.c (next_free): Update for OpenACC.  Move some code into...
	(verify_token_free): ... this new function.
	(next_fixed): Update for OpenACC.  Move some code into...
	(verify_token_fixed): ... this new function.
	(case_executable): Add ST_OACC_UPDATE, ST_OACC_WAIT,
	ST_OACC_CACHE, ST_OACC_ENTER_DATA, and ST_OACC_EXIT_DATA.
	(case_exec_markers): Add ST_OACC_PARALLEL_LOOP, ST_OACC_PARALLEL,
	ST_OACC_KERNELS, ST_OACC_DATA, ST_OACC_HOST_DATA, ST_OACC_LOOP,
	ST_OACC_KERNELS_LOOP.
	(case_decl): Add ST_OACC_ROUTINE.
	(push_state, parse_critical_block, parse_progunit): Update for
	OpenACC.
	(gfc_ascii_statement): Handle ST_OACC_PARALLEL_LOOP,
	ST_OACC_END_PARALLEL_LOOP, ST_OACC_PARALLEL, ST_OACC_END_PARALLEL,
	ST_OACC_KERNELS, ST_OACC_END_KERNELS, ST_OACC_KERNELS_LOOP,
	ST_OACC_END_KERNELS_LOOP, ST_OACC_DATA, ST_OACC_END_DATA,
	ST_OACC_HOST_DATA, ST_OACC_END_HOST_DATA, ST_OACC_LOOP,
	ST_OACC_END_LOOP, ST_OACC_DECLARE, ST_OACC_UPDATE, ST_OACC_WAIT,
	ST_OACC_CACHE, ST_OACC_ENTER_DATA, ST_OACC_EXIT_DATA,
	ST_OACC_ROUTINE.
	(verify_st_order, parse_spec): Handle ST_OACC_DECLARE.
	(parse_executable): Handle ST_OACC_PARALLEL_LOOP,
	ST_OACC_KERNELS_LOOP, ST_OACC_LOOP, ST_OACC_PARALLEL,
	ST_OACC_KERNELS, ST_OACC_DATA, ST_OACC_HOST_DATA.
	(decode_oacc_directive, parse_oacc_structured_block)
	(parse_oacc_loop, is_oacc): New functions.
	* parse.h (struct gfc_state_data): Add oacc_declare_clauses
	member.
	(is_oacc): New prototype.
	* resolve.c (gfc_resolve_blocks, gfc_resolve_code): Handle
	EXEC_OACC_PARALLEL_LOOP, EXEC_OACC_PARALLEL,
	EXEC_OACC_KERNELS_LOOP, EXEC_OACC_KERNELS, EXEC_OACC_DATA,
	EXEC_OACC_HOST_DATA, EXEC_OACC_LOOP, EXEC_OACC_UPDATE,
	EXEC_OACC_WAIT, EXEC_OACC_CACHE, EXEC_OACC_ENTER_DATA,
	EXEC_OACC_EXIT_DATA.
	(resolve_codes): Call gfc_resolve_oacc_declare.
	* scanner.c (openacc_flag, openacc_locus): New variables.
	(skip_free_comments): Update for OpenACC.  Move some code into...
	(skip_omp_attribute): ... this new function.
	(skip_oacc_attribute): New function.
	(skip_fixed_comments, gfc_next_char_literal): Update for OpenACC.
	* st.c (gfc_free_statement): Handle EXEC_OACC_PARALLEL_LOOP,
	EXEC_OACC_PARALLEL, EXEC_OACC_KERNELS_LOOP, EXEC_OACC_KERNELS,
	EXEC_OACC_DATA, EXEC_OACC_HOST_DATA, EXEC_OACC_LOOP,
	EXEC_OACC_UPDATE, EXEC_OACC_WAIT, EXEC_OACC_CACHE,
	EXEC_OACC_ENTER_DATA, EXEC_OACC_EXIT_DATA.
	* trans-decl.c (gfc_generate_function_code): Update for OpenACC.
	* trans-openmp.c: Include "gomp-constants.h".
	(gfc_omp_finish_clause, gfc_trans_omp_clauses): Use GOMP_MAP_*
	instead of OMP_CLAUSE_MAP_*.  Use OMP_CLAUSE_SET_MAP_KIND.
	(gfc_trans_omp_clauses): Handle OMP_LIST_USE_DEVICE,
	OMP_LIST_DEVICE_RESIDENT, OMP_LIST_CACHE, and OMP_MAP_FORCE_ALLOC,
	OMP_MAP_FORCE_DEALLOC, OMP_MAP_FORCE_TO, OMP_MAP_FORCE_FROM,
	OMP_MAP_FORCE_TOFROM, OMP_MAP_FORCE_PRESENT,
	OMP_MAP_FORCE_DEVICEPTR, and gfc_omp_clauses' async, seq,
	independent, wait_list, num_gangs_expr, num_workers_expr,
	vector_length_expr, vector, vector_expr, worker, worker_expr,
	gang, gang_expr members.
	(gfc_trans_omp_do): Handle EXEC_OACC_LOOP.
	(gfc_convert_expr_to_tree, gfc_trans_oacc_construct)
	(gfc_trans_oacc_executable_directive)
	(gfc_trans_oacc_wait_directive, gfc_trans_oacc_combined_directive)
	(gfc_trans_oacc_declare, gfc_trans_oacc_directive): New functions.
	* trans-stmt.c (gfc_trans_block_construct): Update for OpenACC.
	* trans-stmt.h (gfc_trans_oacc_directive, gfc_trans_oacc_declare):
	New prototypes.
	* trans.c (tranc_code): Handle EXEC_OACC_CACHE, EXEC_OACC_WAIT,
	EXEC_OACC_UPDATE, EXEC_OACC_LOOP, EXEC_OACC_HOST_DATA,
	EXEC_OACC_DATA, EXEC_OACC_KERNELS, EXEC_OACC_KERNELS_LOOP,
	EXEC_OACC_PARALLEL, EXEC_OACC_PARALLEL_LOOP, EXEC_OACC_ENTER_DATA,
	EXEC_OACC_EXIT_DATA.
	* gfortran.texi: Update for OpenACC.
	* intrinsic.texi: Likewise.
	* invoke.texi: Likewise.
	gcc/lto/
	* lto-lang.c (DEF_FUNCTION_TYPE_VAR_8, DEF_FUNCTION_TYPE_VAR_12):
	New macros.
	* lto.c: Include "gomp-constants.h".
	gcc/testsuite/
	* lib/target-supports.exp (check_effective_target_fopenacc): New
	procedure.
	* g++.dg/goacc-gomp/goacc-gomp.exp: New file.
	* g++.dg/goacc/goacc.exp: Likewise.
	* gcc.dg/goacc-gomp/goacc-gomp.exp: Likewise.
	* gcc.dg/goacc/goacc.exp: Likewise.
	* gfortran.dg/goacc/goacc.exp: Likewise.
	* c-c++-common/cpp/openacc-define-1.c: New file.
	* c-c++-common/cpp/openacc-define-2.c: Likewise.
	* c-c++-common/cpp/openacc-define-3.c: Likewise.
	* c-c++-common/goacc-gomp/nesting-1.c: Likewise.
	* c-c++-common/goacc-gomp/nesting-fail-1.c: Likewise.
	* c-c++-common/goacc/acc_on_device-2-off.c: Likewise.
	* c-c++-common/goacc/acc_on_device-2.c: Likewise.
	* c-c++-common/goacc/asyncwait-1.c: Likewise.
	* c-c++-common/goacc/cache-1.c: Likewise.
	* c-c++-common/goacc/clauses-fail.c: Likewise.
	* c-c++-common/goacc/collapse-1.c: Likewise.
	* c-c++-common/goacc/data-1.c: Likewise.
	* c-c++-common/goacc/data-2.c: Likewise.
	* c-c++-common/goacc/data-clause-duplicate-1.c: Likewise.
	* c-c++-common/goacc/deviceptr-1.c: Likewise.
	* c-c++-common/goacc/deviceptr-2.c: Likewise.
	* c-c++-common/goacc/deviceptr-3.c: Likewise.
	* c-c++-common/goacc/if-clause-1.c: Likewise.
	* c-c++-common/goacc/if-clause-2.c: Likewise.
	* c-c++-common/goacc/kernels-1.c: Likewise.
	* c-c++-common/goacc/loop-1.c: Likewise.
	* c-c++-common/goacc/loop-private-1.c: Likewise.
	* c-c++-common/goacc/nesting-1.c: Likewise.
	* c-c++-common/goacc/nesting-data-1.c: Likewise.
	* c-c++-common/goacc/nesting-fail-1.c: Likewise.
	* c-c++-common/goacc/parallel-1.c: Likewise.
	* c-c++-common/goacc/pcopy.c: Likewise.
	* c-c++-common/goacc/pcopyin.c: Likewise.
	* c-c++-common/goacc/pcopyout.c: Likewise.
	* c-c++-common/goacc/pcreate.c: Likewise.
	* c-c++-common/goacc/pragma_context.c: Likewise.
	* c-c++-common/goacc/present-1.c: Likewise.
	* c-c++-common/goacc/reduction-1.c: Likewise.
	* c-c++-common/goacc/reduction-2.c: Likewise.
	* c-c++-common/goacc/reduction-3.c: Likewise.
	* c-c++-common/goacc/reduction-4.c: Likewise.
	* c-c++-common/goacc/sb-1.c: Likewise.
	* c-c++-common/goacc/sb-2.c: Likewise.
	* c-c++-common/goacc/sb-3.c: Likewise.
	* c-c++-common/goacc/update-1.c: Likewise.
	* gcc.dg/goacc/acc_on_device-1.c: Likewise.
	* gfortran.dg/goacc/acc_on_device-1.f95: Likewise.
	* gfortran.dg/goacc/acc_on_device-2-off.f95: Likewise.
	* gfortran.dg/goacc/acc_on_device-2.f95: Likewise.
	* gfortran.dg/goacc/assumed.f95: Likewise.
	* gfortran.dg/goacc/asyncwait-1.f95: Likewise.
	* gfortran.dg/goacc/asyncwait-2.f95: Likewise.
	* gfortran.dg/goacc/asyncwait-3.f95: Likewise.
	* gfortran.dg/goacc/asyncwait-4.f95: Likewise.
	* gfortran.dg/goacc/branch.f95: Likewise.
	* gfortran.dg/goacc/cache-1.f95: Likewise.
	* gfortran.dg/goacc/coarray.f95: Likewise.
	* gfortran.dg/goacc/continuation-free-form.f95: Likewise.
	* gfortran.dg/goacc/cray.f95: Likewise.
	* gfortran.dg/goacc/critical.f95: Likewise.
	* gfortran.dg/goacc/data-clauses.f95: Likewise.
	* gfortran.dg/goacc/data-tree.f95: Likewise.
	* gfortran.dg/goacc/declare-1.f95: Likewise.
	* gfortran.dg/goacc/enter-exit-data.f95: Likewise.
	* gfortran.dg/goacc/fixed-1.f: Likewise.
	* gfortran.dg/goacc/fixed-2.f: Likewise.
	* gfortran.dg/goacc/fixed-3.f: Likewise.
	* gfortran.dg/goacc/fixed-4.f: Likewise.
	* gfortran.dg/goacc/host_data-tree.f95: Likewise.
	* gfortran.dg/goacc/if.f95: Likewise.
	* gfortran.dg/goacc/kernels-tree.f95: Likewise.
	* gfortran.dg/goacc/list.f95: Likewise.
	* gfortran.dg/goacc/literal.f95: Likewise.
	* gfortran.dg/goacc/loop-1.f95: Likewise.
	* gfortran.dg/goacc/loop-2.f95: Likewise.
	* gfortran.dg/goacc/loop-3.f95: Likewise.
	* gfortran.dg/goacc/loop-tree-1.f90: Likewise.
	* gfortran.dg/goacc/omp.f95: Likewise.
	* gfortran.dg/goacc/parallel-kernels-clauses.f95: Likewise.
	* gfortran.dg/goacc/parallel-kernels-regions.f95: Likewise.
	* gfortran.dg/goacc/parallel-tree.f95: Likewise.
	* gfortran.dg/goacc/parameter.f95: Likewise.
	* gfortran.dg/goacc/private-1.f95: Likewise.
	* gfortran.dg/goacc/private-2.f95: Likewise.
	* gfortran.dg/goacc/private-3.f95: Likewise.
	* gfortran.dg/goacc/pure-elemental-procedures.f95: Likewise.
	* gfortran.dg/goacc/reduction-2.f95: Likewise.
	* gfortran.dg/goacc/reduction.f95: Likewise.
	* gfortran.dg/goacc/routine-1.f90: Likewise.
	* gfortran.dg/goacc/routine-2.f90: Likewise.
	* gfortran.dg/goacc/sentinel-free-form.f95: Likewise.
	* gfortran.dg/goacc/several-directives.f95: Likewise.
	* gfortran.dg/goacc/sie.f95: Likewise.
	* gfortran.dg/goacc/subarrays.f95: Likewise.
	* gfortran.dg/gomp/map-1.f90: Likewise.
	* gfortran.dg/openacc-define-1.f90: Likewise.
	* gfortran.dg/openacc-define-2.f90: Likewise.
	* gfortran.dg/openacc-define-3.f90: Likewise.
	* g++.dg/gomp/block-1.C: Update for changed compiler output.
	* g++.dg/gomp/block-2.C: Likewise.
	* g++.dg/gomp/block-3.C: Likewise.
	* g++.dg/gomp/block-5.C: Likewise.
	* g++.dg/gomp/target-1.C: Likewise.
	* g++.dg/gomp/target-2.C: Likewise.
	* g++.dg/gomp/taskgroup-1.C: Likewise.
	* g++.dg/gomp/teams-1.C: Likewise.
	* gcc.dg/cilk-plus/jump-openmp.c: Likewise.
	* gcc.dg/cilk-plus/jump.c: Likewise.
	* gcc.dg/gomp/block-1.c: Likewise.
	* gcc.dg/gomp/block-10.c: Likewise.
	* gcc.dg/gomp/block-2.c: Likewise.
	* gcc.dg/gomp/block-3.c: Likewise.
	* gcc.dg/gomp/block-4.c: Likewise.
	* gcc.dg/gomp/block-5.c: Likewise.
	* gcc.dg/gomp/block-6.c: Likewise.
	* gcc.dg/gomp/block-7.c: Likewise.
	* gcc.dg/gomp/block-8.c: Likewise.
	* gcc.dg/gomp/block-9.c: Likewise.
	* gcc.dg/gomp/target-1.c: Likewise.
	* gcc.dg/gomp/target-2.c: Likewise.
	* gcc.dg/gomp/taskgroup-1.c: Likewise.
	* gcc.dg/gomp/teams-1.c: Likewise.
	include/
	* gomp-constants.h: New file.
	libgomp/
	* Makefile.am (search_path): Add $(top_srcdir)/../include.
	(libgomp_la_SOURCES): Add splay-tree.c, libgomp-plugin.c,
	oacc-parallel.c, oacc-host.c, oacc-init.c, oacc-mem.c,
	oacc-async.c, oacc-plugin.c, oacc-cuda.c.
	[USE_FORTRAN] (libgomp_la_SOURCES): Add openacc.f90.
	Include $(top_srcdir)/plugin/Makefrag.am.
	(nodist_libsubinclude_HEADERS): Add openacc.h.
	[USE_FORTRAN] (nodist_finclude_HEADERS): Add openacc_lib.h,
	openacc.f90, openacc.mod, openacc_kinds.mod.
	(omp_lib.mod): Generalize into...
	(%.mod): ... this new rule.
	(openacc_kinds.mod, openacc.mod): New rules.
	* plugin/configfrag.ac: New file.
	* configure.ac: Move plugin/offloading support into it.  Include
	it.  Instantiate testsuite/libgomp-test-support.pt.exp.
	* plugin/Makefrag.am: New file.
	* testsuite/Makefile.am (OFFLOAD_TARGETS)
	(OFFLOAD_ADDITIONAL_OPTIONS, OFFLOAD_ADDITIONAL_LIB_PATHS): Don't
	export.
	(libgomp-test-support.exp): New rule.
	(all-local): Depend on it.
	* Makefile.in: Regenerate.
	* testsuite/Makefile.in: Regenerate.
	* config.h.in: Likewise.
	* configure: Likewise.
	* configure.tgt: Harden shell syntax.
	* env.c: Include "oacc-int.h".
	(parse_acc_device_type): New function.
	(gomp_debug_var, goacc_device_type, goacc_device_num): New
	variables.
	(initialize_env): Initialize those.  Call
	goacc_runtime_initialize.
	* error.c (gomp_vdebug, gomp_debug, gomp_vfatal): New functions.
	(gomp_fatal): Call gomp_vfatal.
	* libgomp.h: Include "libgomp-plugin.h" and <stdarg.h>.
	(gomp_debug_var, goacc_device_type, goacc_device_num, gomp_vdebug)
	(gomp_debug, gomp_verror, gomp_vfatal, gomp_init_targets_once)
	(splay_tree_node, splay_tree, splay_tree_key)
	(struct target_mem_desc, struct splay_tree_key_s)
	(struct gomp_memory_mapping, struct acc_dispatch_t)
	(struct gomp_device_descr, gomp_acc_insert_pointer)
	(gomp_acc_remove_pointer, target_mem_desc, gomp_copy_from_async)
	(gomp_unmap_vars, gomp_init_device, gomp_init_tables)
	(gomp_free_memmap, gomp_fini_device): New declarations.
	(gomp_vdebug, gomp_debug): New macros.
	Include "splay-tree.h".
	* libgomp.map (OACC_2.0): New symbol version.  Use for
	acc_get_num_devices, acc_get_num_devices_h_, acc_set_device_type,
	acc_set_device_type_h_, acc_get_device_type,
	acc_get_device_type_h_, acc_set_device_num, acc_set_device_num_h_,
	acc_get_device_num, acc_get_device_num_h_, acc_async_test,
	acc_async_test_h_, acc_async_test_all, acc_async_test_all_h_,
	acc_wait, acc_wait_h_, acc_wait_async, acc_wait_async_h_,
	acc_wait_all, acc_wait_all_h_, acc_wait_all_async,
	acc_wait_all_async_h_, acc_init, acc_init_h_, acc_shutdown,
	acc_shutdown_h_, acc_on_device, acc_on_device_h_, acc_malloc,
	acc_free, acc_copyin, acc_copyin_32_h_, acc_copyin_64_h_,
	acc_copyin_array_h_, acc_present_or_copyin,
	acc_present_or_copyin_32_h_, acc_present_or_copyin_64_h_,
	acc_present_or_copyin_array_h_, acc_create, acc_create_32_h_,
	acc_create_64_h_, acc_create_array_h_, acc_present_or_create,
	acc_present_or_create_32_h_, acc_present_or_create_64_h_,
	acc_present_or_create_array_h_, acc_copyout, acc_copyout_32_h_,
	acc_copyout_64_h_, acc_copyout_array_h_, acc_delete,
	acc_delete_32_h_, acc_delete_64_h_, acc_delete_array_h_,
	acc_update_device, acc_update_device_32_h_,
	acc_update_device_64_h_, acc_update_device_array_h_,
	acc_update_self, acc_update_self_32_h_, acc_update_self_64_h_,
	acc_update_self_array_h_, acc_map_data, acc_unmap_data,
	acc_deviceptr, acc_hostptr, acc_is_present, acc_is_present_32_h_,
	acc_is_present_64_h_, acc_is_present_array_h_,
	acc_memcpy_to_device, acc_memcpy_from_device,
	acc_get_current_cuda_device, acc_get_current_cuda_context,
	acc_get_cuda_stream, acc_set_cuda_stream.
	(GOACC_2.0): New symbol version.  Use for GOACC_data_end,
	GOACC_data_start, GOACC_enter_exit_data, GOACC_parallel,
	GOACC_update, GOACC_wait, GOACC_get_thread_num,
	GOACC_get_num_threads.
	(GOMP_PLUGIN_1.0): New symbol version.  Use for
	GOMP_PLUGIN_malloc, GOMP_PLUGIN_malloc_cleared,
	GOMP_PLUGIN_realloc, GOMP_PLUGIN_debug, GOMP_PLUGIN_error,
	GOMP_PLUGIN_fatal, GOMP_PLUGIN_async_unmap_vars,
	GOMP_PLUGIN_acc_thread.
	* libgomp.texi: Update for OpenACC changes, and GOMP_DEBUG
	environment variable.
	* libgomp_g.h (GOACC_data_start, GOACC_data_end)
	(GOACC_enter_exit_data, GOACC_parallel, GOACC_update, GOACC_wait)
	(GOACC_get_num_threads, GOACC_get_thread_num): New declarations.
	* splay-tree.h (splay_tree_lookup, splay_tree_insert)
	(splay_tree_remove): New declarations.
	(rotate_left, rotate_right, splay_tree_splay, splay_tree_insert)
	(splay_tree_remove, splay_tree_lookup): Move into...
	* splay-tree.c: ... this new file.
	* target.c: Include "oacc-plugin.h", "oacc-int.h", <assert.h>.
	(splay_tree_node, splay_tree, splay_tree_key)
	(struct target_mem_desc, struct splay_tree_key_s)
	(struct gomp_device_descr): Don't declare.
	(num_devices_openmp): New variable.
	(gomp_get_num_devices ): Use it.
	(gomp_init_targets_once): New function.
	(gomp_get_num_devices ): Use it.
	(get_kind, gomp_copy_from_async, gomp_free_memmap)
	(gomp_fini_device, gomp_register_image_for_device): New functions.
	(gomp_map_vars): Add devaddrs parameter.
	(gomp_update): Add mm parameter.
	(gomp_init_device): Move most of it into...
	(gomp_init_tables): ... this new function.
	(gomp_register_images_for_device): Remove function.
	(splay_compare, gomp_map_vars, gomp_unmap_vars, gomp_init_device):
	Make them hidden instead of static.
	(gomp_map_vars_existing, gomp_map_vars, gomp_unmap_vars)
	(gomp_update, gomp_init_device, GOMP_target, GOMP_target_data)
	(GOMP_target_end_data, GOMP_target_update)
	(gomp_load_plugin_for_device, gomp_target_init): Update for
	OpenACC changes.
	* oacc-async.c: New file.
	* oacc-cuda.c: Likewise.
	* oacc-host.c: Likewise.
	* oacc-init.c: Likewise.
	* oacc-int.h: Likewise.
	* oacc-mem.c: Likewise.
	* oacc-parallel.c: Likewise.
	* oacc-plugin.c: Likewise.
	* oacc-plugin.h: Likewise.
	* oacc-ptx.h: Likewise.
	* openacc.f90: Likewise.
	* openacc.h: Likewise.
	* openacc_lib.h: Likewise.
	* plugin/plugin-host.c: Likewise.
	* plugin/plugin-nvptx.c: Likewise.
	* libgomp-plugin.c: Likewise.
	* libgomp-plugin.h: Likewise.
	* libgomp_target.h: Remove file after merging content into the
	former file.  Update all users.
	* testsuite/lib/libgomp.exp: Load libgomp-test-support.exp.
	(offload_targets_s, offload_targets_s_openacc): New variables.
	(check_effective_target_openacc_nvidia_accel_present)
	(check_effective_target_openacc_nvidia_accel_selected): New
	procedures.
	(libgomp_init): Update for OpenACC changes.
	* testsuite/libgomp-test-support.exp.in: New file.
	* testsuite/libgomp.oacc-c++/c++.exp: Likewise.
	* testsuite/libgomp.oacc-c/c.exp: Likewise.
	* testsuite/libgomp.oacc-fortran/fortran.exp: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/abort-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/abort-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/abort-3.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/abort-4.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/acc_on_device-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/asyncwait-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/cache-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/clauses-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/clauses-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/collapse-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/collapse-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/collapse-3.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/collapse-4.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/context-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/context-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/context-3.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/context-4.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/data-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/data-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/data-3.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/data-already-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/data-already-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/data-already-3.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/data-already-4.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/data-already-5.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/data-already-6.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/data-already-7.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/data-already-8.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/deviceptr-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/if-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/kernels-empty.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-10.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-11.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-12.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-13.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-14.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-15.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-16.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-17.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-18.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-19.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-20.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-21.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-22.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-23.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-24.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-25.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-26.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-27.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-28.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-29.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-3.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-30.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-31.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-32.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-33.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-34.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-35.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-36.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-37.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-38.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-39.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-4.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-40.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-41.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-42.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-43.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-44.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-45.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-46.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-47.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-48.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-49.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-5.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-50.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-51.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-52.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-53.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-54.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-55.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-56.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-57.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-58.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-59.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-6.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-60.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-61.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-62.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-63.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-64.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-65.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-66.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-67.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-68.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-69.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-7.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-70.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-71.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-72.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-73.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-74.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-75.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-76.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-77.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-78.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-79.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-80.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-81.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-82.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-83.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-84.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-85.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-86.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-87.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-88.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-89.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-9.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-90.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-91.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/lib-92.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/nested-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/nested-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/offset-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/parallel-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/parallel-empty.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/pointer-align-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/present-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/present-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/reduction-1.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/reduction-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/reduction-3.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/reduction-4.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/reduction-5.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/reduction-initial-1.c:
	Likewise.
	* testsuite/libgomp.oacc-c-c++-common/subr.h: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/subr.ptx: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/timer.h: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/update-1-2.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/update-1.c: Likewise.
	* testsuite/libgomp.oacc-fortran/abort-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/abort-2.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/acc_on_device-1-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/acc_on_device-1-2.f: Likewise.
	* testsuite/libgomp.oacc-fortran/acc_on_device-1-3.f: Likewise.
	* testsuite/libgomp.oacc-fortran/asyncwait-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/asyncwait-2.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/asyncwait-3.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/collapse-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/collapse-2.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/collapse-3.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/collapse-4.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/collapse-5.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/collapse-6.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/collapse-7.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/collapse-8.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/data-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/data-2.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/data-3.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/data-4-2.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/data-4.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/data-already-1.f: Likewise.
	* testsuite/libgomp.oacc-fortran/data-already-2.f: Likewise.
	* testsuite/libgomp.oacc-fortran/data-already-3.f: Likewise.
	* testsuite/libgomp.oacc-fortran/data-already-4.f: Likewise.
	* testsuite/libgomp.oacc-fortran/data-already-5.f: Likewise.
	* testsuite/libgomp.oacc-fortran/data-already-6.f: Likewise.
	* testsuite/libgomp.oacc-fortran/data-already-7.f: Likewise.
	* testsuite/libgomp.oacc-fortran/data-already-8.f: Likewise.
	* testsuite/libgomp.oacc-fortran/lib-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/lib-10.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/lib-2.f: Likewise.
	* testsuite/libgomp.oacc-fortran/lib-3.f: Likewise.
	* testsuite/libgomp.oacc-fortran/lib-4.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/lib-5.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/lib-6.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/lib-7.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/lib-8.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/map-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/openacc_version-1.f: Likewise.
	* testsuite/libgomp.oacc-fortran/openacc_version-2.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/pointer-align-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/pset-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/reduction-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/reduction-2.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/reduction-3.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/reduction-4.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/reduction-5.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/reduction-6.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/routine-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/routine-2.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/routine-3.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/routine-4.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/subarrays-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/subarrays-2.f90: Likewise.
	liboffloadmic/
	* plugin/libgomp-plugin-intelmic.cpp (GOMP_OFFLOAD_get_name)
	(GOMP_OFFLOAD_get_caps, GOMP_OFFLOAD_fini_device): New functions.

Co-Authored-By: Bernd Schmidt <bernds@codesourcery.com>
Co-Authored-By: Cesar Philippidis <cesar@codesourcery.com>
Co-Authored-By: Dmitry Bocharnikov <dmitry.b@samsung.com>
Co-Authored-By: Evgeny Gavrin <e.gavrin@samsung.com>
Co-Authored-By: Ilmir Usmanov <i.usmanov@samsung.com>
Co-Authored-By: Jakub Jelinek <jakub@redhat.com>
Co-Authored-By: James Norris <jnorris@codesourcery.com>
Co-Authored-By: Julian Brown <julian@codesourcery.com>
Co-Authored-By: Nathan Sidwell <nathan@codesourcery.com>
Co-Authored-By: Tobias Burnus <burnus@net-b.de>
Co-Authored-By: Tom de Vries <tom@codesourcery.com>

From-SVN: r219682
2015-01-15 21:11:12 +01:00