gcc/include
Tobias Burnus 9967206d41 [og13] OpenMP: Call cuMemcpy2D/cuMemcpy3D for nvptx for omp_target_memcpy_rect
This is a version of Tobias's mainline patch of the same name,
merged to og13 and with the followup patch "libgomp: cuda.h and
omp_target_memcpy_rect cleanup" folded in.  A couple of merge conflicts
have also been resolved, mostly regarding "gomp_update".  Tobias's
original log message follows.

When copying a 2D or 3D rectangular memory block, the performance is
better when using CUDA's cuMemcpy2D/cuMemcpy3D instead of copying the
data one by one. That's what this commit does.

Additionally, it permits device-to-device copies, if necessary using a
temporary variable on the host.
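
To illustrate what a rectangular copy involves, here is a minimal host-only
sketch in C. It is not libgomp or plugin code: the helper name copy_rect_2d
and its parameter layout are hypothetical, chosen only for this example. It
copies a 2D sub-rectangle between two row-major arrays with one memcpy per
row, which is the kind of piecewise transfer that a single strided copy call
can replace.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (an assumption for illustration, not part of libgomp):
   copy a ROWS x COLS sub-rectangle between two row-major 2D arrays.
   DST_NCOLS and SRC_NCOLS are the full row lengths (in elements) of the
   destination and source arrays; the *_ROW0 / *_COL0 arguments give the
   element offsets of the rectangle's top-left corner in each array.  */
static void
copy_rect_2d (void *dst, const void *src, size_t elem_size,
	      size_t rows, size_t cols,
	      size_t dst_row0, size_t dst_col0, size_t dst_ncols,
	      size_t src_row0, size_t src_col0, size_t src_ncols)
{
  char *d = (char *) dst;
  const char *s = (const char *) src;

  /* One contiguous transfer per row.  Between host and device, each of
     these would be a separate copy operation.  */
  for (size_t r = 0; r < rows; r++)
    memcpy (d + ((dst_row0 + r) * dst_ncols + dst_col0) * elem_size,
	    s + ((src_row0 + r) * src_ncols + src_col0) * elem_size,
	    cols * elem_size);
}
```

A strided interface such as cuMemcpy2D describes the same rectangle with a
width in bytes, a height, and a pitch for each side, so the whole block can be
handed to the driver in one call rather than once per row.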

2023-09-19  Tobias Burnus  <tobias@codesourcery.com>
	    Julian Brown  <julian@codesourcery.com>

include/
	* cuda/cuda.h (CUlimit): Add CUDA_ERROR_NOT_INITIALIZED,
	CUDA_ERROR_DEINITIALIZED, CUDA_ERROR_INVALID_HANDLE.
	(CUarray, CUmemorytype, CUDA_MEMCPY2D, CUDA_MEMCPY3D,
	CUDA_MEMCPY3D_PEER): New typedefs.
	(cuMemcpyPeer, cuMemcpyPeerAsync, cuMemcpy2D, cuMemcpy2DAsync,
	cuMemcpy2DUnaligned, cuMemcpy3D, cuMemcpy3DAsync, cuMemcpy3DPeer,
	cuMemcpy3DPeerAsync): New prototypes.

libgomp/
	* libgomp-plugin.h (GOMP_OFFLOAD_memcpy2d,
	GOMP_OFFLOAD_memcpy3d): New prototypes.
	* libgomp.h (struct gomp_device_descr): Add memcpy2d_func
	and memcpy3d_func.
	* libgomp.texi (nvptx): Document when cuMemcpy2D/cuMemcpy3D is used.
	* oacc-host.c (memcpy2d_func, memcpy3d_func): Init with NULL.
	* plugin/cuda-lib.def (cuMemcpy2D, cuMemcpy2DUnaligned,
	cuMemcpy3D): Invoke via CUDA_ONE_CALL.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_memcpy2d,
	GOMP_OFFLOAD_memcpy3d): New.
	* target.c (omp_target_memcpy_rect_worker): Update prototype.
	(omp_target_memcpy_rect_check, omp_target_memcpy_rect_copy):
	Permit all device-to-device copies; invoke new plugins for
	2D and 3D copying when available.
	(gomp_update): Update calls to omp_target_memcpy_rect_worker.  Ensure
	that tmp space is not allocated here.
	(gomp_load_plugin_for_device): DLSYM the new plugin functions.
	* testsuite/libgomp.c/target-12.c: Fix dimension bug.
	* testsuite/libgomp.fortran/target-12.f90: Likewise.
	* testsuite/libgomp.fortran/target-memcpy-rect-1.f90: New test.
2023-09-20 11:18:51 +00:00
cuda [og13] OpenMP: Call cuMemcpy2D/cuMemcpy3D for nvptx for omp_target_memcpy_rect 2023-09-20 11:18:51 +00:00
gdb Update copyright years. 2023-01-16 11:52:17 +01:00
COPYING
COPYING3
ChangeLog Update ChangeLog and version files for release 2023-07-27 08:13:36 +00:00
ChangeLog-9103
ChangeLog.jit
ChangeLog.omp [og13] OpenMP: Call cuMemcpy2D/cuMemcpy3D for nvptx for omp_target_memcpy_rect 2023-09-20 11:18:51 +00:00
ansidecl.h Update copyright years. 2023-01-16 11:52:17 +01:00
btf.h Update copyright years. 2023-01-16 11:52:17 +01:00
ctf.h Update copyright years. 2023-01-16 11:52:17 +01:00
demangle.h Update copyright years. 2023-01-16 11:52:17 +01:00
dwarf2.def Update copyright years. 2023-01-16 11:52:17 +01:00
dwarf2.h Update copyright years. 2023-01-16 11:52:17 +01:00
dyn-string.h Update copyright years. 2023-01-16 11:52:17 +01:00
environ.h Update copyright years. 2023-01-16 11:52:17 +01:00
fibheap.h Update copyright years. 2023-01-16 11:52:17 +01:00
filenames.h Update copyright years. 2023-01-16 11:52:17 +01:00
floatformat.h Update copyright years. 2023-01-16 11:52:17 +01:00
fnmatch.h Update copyright years. 2023-01-16 11:52:17 +01:00
gcc-c-fe.def Update copyright years. 2023-01-16 11:52:17 +01:00
gcc-c-interface.h Update copyright years. 2023-01-16 11:52:17 +01:00
gcc-cp-fe.def Update copyright years. 2023-01-16 11:52:17 +01:00
gcc-cp-interface.h Update copyright years. 2023-01-16 11:52:17 +01:00
gcc-interface.h Update copyright years. 2023-01-16 11:52:17 +01:00
getopt.h Update copyright years. 2023-01-16 11:52:17 +01:00
gomp-constants.h libgomp: parallel reverse offload 2023-09-12 15:06:31 +01:00
hashtab.h Update copyright years. 2023-01-16 11:52:17 +01:00
hsa.h Import HSA header files from AMD 2020-12-09 11:10:40 +00:00
hsa_ext_amd.h Import HSA header files from AMD 2020-12-09 11:10:40 +00:00
hsa_ext_image.h Import HSA header files from AMD 2020-12-09 11:10:40 +00:00
leb128.h Update copyright years. 2023-01-16 11:52:17 +01:00
libiberty.h Update copyright years. 2023-01-16 11:52:17 +01:00
longlong.h Update copyright years. 2023-01-16 11:52:17 +01:00
lto-symtab.h Update copyright years. 2023-01-16 11:52:17 +01:00
md5.h Update copyright years. 2023-01-16 11:52:17 +01:00
objalloc.h Update copyright years. 2023-01-16 11:52:17 +01:00
obstack.h Update copyright years. 2023-01-16 11:52:17 +01:00
partition.h Update copyright years. 2023-01-16 11:52:17 +01:00
plugin-api.h Implement LDPT_REGISTER_CLAIM_FILE_HOOK_V2 linker plugin hook [PR109128] 2023-05-19 16:37:17 +01:00
safe-ctype.h Update copyright years. 2023-01-16 11:52:17 +01:00
sha1.h Update copyright years. 2023-01-16 11:52:17 +01:00
simple-object.h Update copyright years. 2023-01-16 11:52:17 +01:00
sort.h Update copyright years. 2023-01-16 11:52:17 +01:00
splay-tree.h Update copyright years. 2023-01-16 11:52:17 +01:00
symcat.h Update copyright years. 2023-01-16 11:52:17 +01:00
timeval-utils.h Update copyright years. 2023-01-16 11:52:17 +01:00
vtv-change-permission.h Update copyright years. 2023-01-16 11:52:17 +01:00
xregex.h
xregex2.h Update copyright years. 2023-01-16 11:52:17 +01:00
xtensa-config.h Update copyright years. 2023-01-16 11:52:17 +01:00
xtensa-dynconfig.h gcc: xtensa: add XCHAL_HAVE_{CLAMPS,DEPBITS,EXCLUSIVE,XEA3} to dynconfig 2023-02-27 04:03:33 -08:00