Commit Graph

4 Commits

Author SHA1 Message Date
Cesar Philippidis bd9b3d3d1a [nvptx] Use CUDA driver API to select default runtime launch geometry
The CUDA driver API starting version 6.5 offers a set of runtime functions to
calculate several occupancy-related measures, as a replacement for the occupancy
calculator spreadsheet.

This patch adds a heuristic for default runtime launch geometry, based on the
new runtime function cuOccupancyMaxPotentialBlockSize.

Build on x86_64 with nvptx accelerator and ran libgomp testsuite.

2018-08-13  Cesar Philippidis  <cesar@codesourcery.com>
	    Tom de Vries  <tdevries@suse.de>

	PR target/85590
	* plugin/cuda/cuda.h (CUoccupancyB2DSize): New typedef.
	(cuOccupancyMaxPotentialBlockSize): Declare.
	* plugin/cuda-lib.def (cuOccupancyMaxPotentialBlockSize): New
	CUDA_ONE_CALL_MAYBE_NULL.
	* plugin/plugin-nvptx.c (CUDA_VERSION < 6050): Define
	CUoccupancyB2DSize and declare
	cuOccupancyMaxPotentialBlockSize.
	(nvptx_exec): Use cuOccupancyMaxPotentialBlockSize to set the
	default num_gangs and num_workers when the driver supports it.

Co-Authored-By: Tom de Vries <tdevries@suse.de>

From-SVN: r263505
2018-08-13 12:04:24 +00:00
Tom de Vries 8e09a12f01 [libgomp, nvptx] Fall back to cuLinkAddData/cuLinkCreate if _v2 not found
Cuda driver api functions cuLinkAddData and cuLinkCreate are available starting
version 5.5.  In version 6.5, they are remapped onto _v2 versions.

The dlopen interface of the libgomp nvptx plugin uses the _v2 versions, so it
won't work with a cuda driver with driver api version lower than 6.5.

This patch fixes the problem by testing for the presence of the _v2 versions,
and falling back to the original versions in case of absence of the _v2
versions.

Build on x86_64 with nvptx accelerator and reg-tested libgomp, both with and
without --without-cuda-driver.

2018-08-08  Tom de Vries  <tdevries@suse.de>

	* plugin/cuda-lib.def (cuLinkAddData_v2, cuLinkCreate_v2): Declare using
	CUDA_ONE_CALL_MAYBE_NULL.
	* plugin/plugin-nvptx.c (cuLinkAddData, cuLinkCreate): Undef and declare.
	(cuLinkAddData_v2, cuLinkCreate_v2): Declare.
	(link_ptx): Fall back to cuLinkAddData/cuLinkCreate if the _v2 versions
	are not found.

From-SVN: r263408
2018-08-08 14:26:37 +00:00
Tom de Vries cedd9bd016 [libgomp, nvptx] Allow cuGetErrorString to be NULL
Cuda driver api function cuGetErrorString is available in version 6.0 and
higher.

Currently, when the driver that is used does not contain this function, the
libgomp nvptx plugin will not build (PLUGIN_NVPTX_DYNAMIC == 0) or run
(PLUGIN_NVPTX_DYNAMIC == 1).

This patch fixes this problem by testing for the presence of the function, and
handling absence.

Build on x86_64 with nvptx accelerator and reg-tested libgomp, both with and
without --without-cuda-driver.

2018-08-08  Tom de Vries  <tdevries@suse.de>

	* plugin/cuda-lib.def (cuGetErrorString): Use CUDA_ONE_CALL_MAYBE_NULL.
	* plugin/plugin-nvptx.c (cuda_error): Handle if cuGetErrorString is not
	present.

From-SVN: r263407
2018-08-08 14:26:28 +00:00
Tom de Vries 8c6310a2c2 [libgomp, nvptx] Add cuda-lib.def
2018-08-01  Tom de Vries  <tdevries@suse.de>

	* plugin/cuda-lib.def: New file.  Factor out of ...
	* plugin/plugin-nvptx.c (CUDA_CALLS): ... here.
	(struct cuda_lib_s, init_cuda_lib): Include cuda-lib.def instead of
	using CUDA_CALLS.

From-SVN: r263208
2018-08-01 13:20:22 +00:00